Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation

Xie, Senwei; Wang, Hongyu; Xiao, Zhanqi; Wang, Ruiping; Chen, Xilin


arXiv:2501.04268 (cs)

[Submitted on 8 Jan 2025]



Abstract: Zero-shot generalization across robots, tasks, and environments remains a significant challenge in robotic manipulation. Policy code generation methods use executable code to connect high-level task descriptions to low-level action sequences, leveraging the generalization capabilities of large language models and atomic skill libraries. In this work, we propose Robotic Programmer (RoboPro), a robotic foundation model that perceives visual information and follows free-form instructions to perform robotic manipulation with policy code in a zero-shot manner. To address the low efficiency and high cost of collecting runtime code data for robotic tasks, we devise Video2Code, which synthesizes executable code from extensive in-the-wild videos using an off-the-shelf vision-language model and a code-domain large language model. Extensive experiments show that RoboPro achieves state-of-the-art zero-shot performance on robotic manipulation in both simulators and real-world environments. Specifically, the zero-shot success rate of RoboPro on RLBench surpasses that of the state-of-the-art model GPT-4o by 11.6%, which is even comparable to a strong supervised training baseline. Furthermore, RoboPro is robust to variations in API formats and skill sets.
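To make the idea of "policy code" over an atomic skill library concrete, the following is a minimal sketch in Python. It is not RoboPro's actual API or output: the class `ToySkillLibrary`, its methods (`detect`, `grasp`, `move_to`, `release`), the object names, and the positions are all hypothetical, invented here for illustration. The skills only record calls rather than driving a robot.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Position = Tuple[float, float, float]


@dataclass
class ToySkillLibrary:
    """Hypothetical atomic skill library; logs calls instead of moving a robot."""
    log: List[tuple] = field(default_factory=list)
    holding: Optional[str] = None

    def detect(self, name: str) -> Position:
        # Stand-in for perception: look up a fixed, made-up position.
        positions = {"red block": (0.4, 0.1, 0.02), "drawer": (0.6, -0.2, 0.10)}
        self.log.append(("detect", name))
        return positions[name]

    def grasp(self, name: str, position: Position) -> None:
        # Stand-in for a grasp primitive: remember which object is held.
        self.log.append(("grasp", name))
        self.holding = name

    def move_to(self, position: Position) -> None:
        # Stand-in for an end-effector motion primitive.
        self.log.append(("move_to", position))

    def release(self) -> None:
        # Stand-in for opening the gripper.
        self.log.append(("release", self.holding))
        self.holding = None


def put_block_in_drawer(skills: ToySkillLibrary) -> None:
    """Illustrative policy code for the instruction
    'put the red block in the drawer' (hand-written, not model output)."""
    block_pos = skills.detect("red block")
    drawer_pos = skills.detect("drawer")
    skills.grasp("red block", block_pos)
    skills.move_to(drawer_pos)
    skills.release()
```

In this sketch the high-level instruction maps to a short program whose every step is a call into the skill library, so swapping the library's implementation (simulator versus real robot) would leave the policy code unchanged. That separation is the general motivation the abstract attributes to policy code generation methods.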

Subjects:	Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.04268 [cs.RO]
	(or arXiv:2501.04268v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2501.04268

Submission history

From: Senwei Xie
[v1] Wed, 8 Jan 2025 04:30:45 UTC (20,091 KB)
