i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops

Abeyruwan, Saminda; Graesser, Laura; D'Ambrosio, David B.; Singh, Avi; Shankar, Anish; Bewley, Alex; Sanketi, Pannag R.

Computer Science > Robotics

arXiv:2207.06572v2 (cs)

[Submitted on 14 Jul 2022 (v1), revised 12 Aug 2022 (this version, v2), latest version 22 Nov 2022 (v4)]

Title:i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops

Authors:Saminda Abeyruwan, Laura Graesser, David B. D'Ambrosio, Avi Singh, Anish Shankar, Alex Bewley, Pannag R. Sanketi

View PDF

Abstract:Sim-to-real transfer is a powerful paradigm for robotic reinforcement learning. The ability to train policies in simulation enables safe exploration and large-scale data collection quickly at low cost. However, prior works in sim-to-real transfer of robotic policies typically do not involve any human-robot interaction because accurately simulating human behavior is an open problem. In this work, our goal is to leverage the power of simulation to train robotic policies that are proficient at interacting with humans upon deployment. But there is a chicken and egg problem -- how do we gather examples of a human interacting with a physical robot so as to model human behavior in simulation without already having a robot that is able to interact with a human? Our proposed method, Iterative-Sim-to-Real (i-S2R), attempts to address this. i-S2R bootstraps from a simple model of human behavior and alternates between training in simulation and deploying in the real world. In each iteration, both the human behavior model and the policy are refined. We evaluate our method on a real world robotic table tennis setting, where the objective for the robot is to play cooperatively with a human player for as long as possible. Table tennis is a high-speed, dynamic task that requires the two players to react quickly to each other's moves, making a challenging test bed for research on human-robot interaction. We present results on an industrial robotic arm that is able to cooperatively play table tennis with human players, achieving rallies of 22 successive hits on average and 150 at best. Further, for 80% of players, rally lengths are 70% to 175% longer compared to the sim-to-real (S2R) baseline. For videos of our system in action, please see this https URL.

Comments:	8+21 pages
Subjects:	Robotics (cs.RO)
Cite as:	arXiv:2207.06572 [cs.RO]
	(or arXiv:2207.06572v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2207.06572

Submission history

From: Laura Graesser [view email]
[v1] Thu, 14 Jul 2022 00:26:45 UTC (3,660 KB)
[v2] Fri, 12 Aug 2022 17:15:00 UTC (3,660 KB)
[v3] Fri, 7 Oct 2022 14:59:46 UTC (3,660 KB)
[v4] Tue, 22 Nov 2022 02:05:12 UTC (3,847 KB)

Computer Science > Robotics

Title:i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Robotics

Title:i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators