Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis

Dorbala, Vishnu Sashank; Chowdhury, Sanjoy; Manocha, Dinesh

Computer Science > Robotics

arXiv:2403.11487 (cs)

[Submitted on 18 Mar 2024 (v1), last revised 2 Apr 2024 (this version, v3)]

Title:Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis

Authors:Vishnu Sashank Dorbala, Sanjoy Chowdhury, Dinesh Manocha

View PDF HTML (experimental)

Abstract:We present a novel approach to automatically synthesize "wayfinding instructions" for an embodied robot agent. In contrast to prior approaches that are heavily reliant on human-annotated datasets designed exclusively for specific simulation platforms, our algorithm uses in-context learning to condition an LLM to generate instructions using just a few references. Using an LLM-based Visual Question Answering strategy, we gather detailed information about the environment which is used by the LLM for instruction synthesis. We implement our approach on multiple simulation platforms including Matterport3D, AI Habitat and ThreeDWorld, thereby demonstrating its platform-agnostic nature. We subjectively evaluate our approach via a user study and observe that 83.3% of users find the synthesized instructions accurately capture the details of the environment and show characteristics similar to those of human-generated instructions. Further, we conduct zero-shot navigation with multiple approaches on the REVERIE dataset using the generated instructions, and observe very close correlation with the baseline on standard success metrics (< 1% change in SR), quantifying the viability of generated instructions in replacing human-annotated data. We finally discuss the applicability of our approach in enabling a generalizable evaluation of embodied navigation policies. To the best of our knowledge, ours is the first LLM-driven approach capable of generating "human-like" instructions in a platform-agnostic manner, without training.

Comments:	14 Pages
Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.11487 [cs.RO]
	(or arXiv:2403.11487v3 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2403.11487

Submission history

From: Vishnu Sashank Dorbala [view email]
[v1] Mon, 18 Mar 2024 05:38:07 UTC (9,664 KB)
[v2] Sun, 31 Mar 2024 22:21:18 UTC (9,669 KB)
[v3] Tue, 2 Apr 2024 04:27:55 UTC (9,669 KB)

Computer Science > Robotics

Title:Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Robotics

Title:Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators