Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks

Gajdošech, Lukáš; Ali, Hassan; Habekost, Jan-Gerrit; Madaras, Martin; Kerzel, Matthias; Wermter, Stefan

arXiv:2503.04308 (cs)

[Submitted on 6 Mar 2025]

Abstract: Datasets for object detection often do not cover a sufficient variety of glasses, owing to their transparent and reflective properties. In particular, open-vocabulary object detectors, widely used in embodied robotic agents, fail to distinguish subclasses of glasses. This gap is a problem for robotic applications, which suffer from errors accumulating across detection, planning, and action execution. The paper introduces a novel method for acquiring real-world data from RGB-D sensors that minimizes human effort. We propose an auto-labeling pipeline that generates labels for all acquired frames based on the depth measurements. We provide a novel real-world glass object dataset that was collected on the Neuro-Inspired COLlaborator (NICOL), a humanoid robot platform. The dataset consists of 7850 images recorded from five different cameras. We show that our trained baseline model outperforms state-of-the-art open-vocabulary approaches. In addition, we deploy our baseline model in an embodied agent approach on the NICOL platform, where it achieves a success rate of 81% in a human-robot bartending scenario.

Comments:	Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2025
Subjects:	Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
MSC classes:	68T40
ACM classes:	I.2.9; I.4.8
Cite as:	arXiv:2503.04308 [cs.RO]
	(or arXiv:2503.04308v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2503.04308

Submission history

From: Lukáš Gajdošech
[v1] Thu, 6 Mar 2025 10:51:04 UTC (25,124 KB)
