ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

Shen, Bokui; Jiang, Zhenyu; Choy, Christopher; Guibas, Leonidas J.; Savarese, Silvio; Anandkumar, Anima; Zhu, Yuke

arXiv:2203.06856 (cs)

[Submitted on 14 Mar 2022 (v1), last revised 5 Aug 2022 (this version, v3)]

Abstract: Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit representations for action-conditional dynamics and geodesics-based contrastive learning. To represent deformable dynamics from partial RGB-D observations, we learn implicit representations of occupancy and flow-based forward dynamics. To accurately identify state change under large non-rigid deformations, we learn a correspondence embedding field through a novel geodesics-based contrastive loss. To evaluate our approach, we develop a simulation framework for manipulating complex deformable shapes in realistic scenes and a benchmark containing over 17,000 action trajectories with six types of plush toys and 78 variants. Our model outperforms existing approaches in geometry, correspondence, and dynamics prediction. The ACID dynamics models are successfully applied to goal-conditioned deformable manipulation tasks, resulting in a 30% increase in task success rate over the strongest baseline. Furthermore, we apply the simulation-trained ACID model directly to real-world objects and show success in manipulating them into target configurations. For more results and information, please visit this https URL.
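
The abstract names a geodesics-based contrastive loss for the correspondence embedding field but does not give its form. As a rough illustration only, the sketch below shows one generic way such a loss could be set up: a margin-based contrastive loss in which geodesic distance on the object surface decides whether a pair of points counts as a match. The function name, the `pos_thresh` and `margin` parameters, and the hinge formulation are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def geodesic_contrastive_loss(emb_a, emb_b, geo_dist, pos_thresh=0.05, margin=1.0):
    """Hypothetical margin-based contrastive loss over all point pairs.

    emb_a:    (N, D) embeddings of points sampled from one object state
    emb_b:    (M, D) embeddings of points sampled from another state
    geo_dist: (N, M) geodesic distance between point i and point j,
              measured on a shared reference surface (assumed given)
    """
    # Pairwise Euclidean distances in embedding space, shape (N, M).
    diff = emb_a[:, None, :] - emb_b[None, :, :]
    d = np.linalg.norm(diff, axis=-1)

    # Assumption: pairs that are geodesically close are positives.
    pos = geo_dist <= pos_thresh

    # Positives are pulled together; negatives are pushed at least `margin` apart.
    pos_loss = np.where(pos, d ** 2, 0.0)
    neg_loss = np.where(~pos, np.maximum(0.0, margin - d) ** 2, 0.0)
    return float((pos_loss + neg_loss).mean())
```

Under these assumptions, embeddings that separate geodesically distant points by more than the margin incur zero loss, while embeddings that collapse every point to the same vector are penalized on every negative pair. Using geodesic rather than Euclidean distance to label pairs means two surface points that touch after a deformation are still treated as non-matching.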

Comments:	RSS 2022 Best Student Paper Award Finalist. Please check out more details at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:2203.06856 [cs.CV]
	(or arXiv:2203.06856v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.06856
Journal reference:	Robotics: Science and Systems (RSS), 2022

Submission history

From: Bokui Shen [view email]
[v1] Mon, 14 Mar 2022 04:56:55 UTC (15,322 KB)
[v2] Mon, 2 May 2022 20:33:34 UTC (12,657 KB)
[v3] Fri, 5 Aug 2022 19:21:02 UTC (13,987 KB)
