Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments

Xu, Yifan; Kamat, Vineet; Menassa, Carol

Abstract: In assistive robotics serving people with disabilities (PWD), accurate place recognition in built environments is crucial for robots to navigate and interact safely within diverse indoor spaces. Language interfaces, particularly those powered by Large Language Models (LLMs) and Vision Language Models (VLMs), hold significant promise in this context because they can interpret visual scenes and correlate them with semantic information. However, such interfaces are also known to hallucinate predictions. In addition, language instructions provided by humans can be ambiguous and lack precise details about specific locations, objects, or actions, which exacerbates the hallucination problem. In this work, we introduce Seeing with Partial Certainty (SwPC), a framework designed to measure and align uncertainty in VLM-based place recognition, enabling the model to recognize when it lacks confidence and to seek assistance when necessary. The framework builds on the theory of conformal prediction to provide statistical guarantees on place recognition while minimizing requests for human help in complex indoor environments. Through experiments on Matterport3D, a widely used, richly annotated scene dataset, we show that SwPC significantly increases the success rate and decreases the amount of human intervention required relative to the prior art. SwPC can be used directly with any VLM without model fine-tuning, offering a promising, lightweight approach to uncertainty modeling that complements and scales alongside the expanding capabilities of foundation models.
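The abstract does not spell out the SwPC procedure, so the following is only a minimal sketch of generic split conformal prediction applied to place labels, not the authors' implementation. It assumes a VLM that returns a confidence per candidate place label, a nonconformity score of one minus the confidence of the true label, and a rule that asks a human whenever the prediction set is not a single label. The function names (`conformal_threshold`, `prediction_set`, `decide`) and all numbers are hypothetical.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Finite-sample-corrected quantile of calibration nonconformity scores.

    cal_scores: scores (1 - confidence of the true label) on held-out data.
    alpha: target miscoverage rate, e.g. 0.25 for 75% coverage.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the corrected quantile
    if k > n:
        # Too few calibration points for this alpha: include every label.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, qhat):
    """Keep every label whose nonconformity score 1 - p is at most qhat."""
    return {label for label, p in probs.items() if 1.0 - p <= qhat}


def decide(probs, qhat):
    """Act on a singleton set; otherwise (ambiguous or empty) ask for help."""
    labels = prediction_set(probs, qhat)
    if len(labels) == 1:
        return ("act", next(iter(labels)))
    return ("ask_human", labels)


# Hypothetical calibration scores and VLM confidences, for illustration only.
cal_scores = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.9]
qhat = conformal_threshold(cal_scores, alpha=0.25)

confident = {"kitchen": 0.7, "dining room": 0.2, "bedroom": 0.1}
ambiguous = {"kitchen": 0.5, "dining room": 0.5}

print(decide(confident, qhat))
print(decide(ambiguous, qhat))
```

Under the usual exchangeability assumption between calibration and test data, sets built this way contain the true label with probability at least 1 - alpha; a singleton set lets the robot act, while a larger or empty set triggers a request for help.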

Comments:	10 pages, 4 Figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.04947 [cs.CV]
	(or arXiv:2501.04947v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.04947

