KINet: Unsupervised Forward Models for Robotic Pushing Manipulation

Rezazadeh, Alireza; Choi, Changhyun


arXiv:2202.09006 (cs)

[Submitted on 18 Feb 2022 (v1), last revised 5 Aug 2023 (this version, v3)]



Abstract: Object-centric representation is an essential abstraction for forward prediction. Most existing forward models learn this representation through extensive supervision (e.g., object class and bounding box) although such ground-truth information is not readily accessible in reality. To address this, we introduce KINet (Keypoint Interaction Network) -- an end-to-end unsupervised framework to reason about object interactions based on a keypoint representation. Using visual observations, our model learns to associate objects with keypoint coordinates and discovers a graph representation of the system as a set of keypoint embeddings and their relations. It then learns an action-conditioned forward model using contrastive estimation to predict future keypoint states. By learning to perform physical reasoning in the keypoint space, our model automatically generalizes to scenarios with a different number of objects, novel backgrounds, and unseen object geometries. Experiments demonstrate the effectiveness of our model in accurately performing forward prediction and learning plannable object-centric representations for downstream robotic pushing manipulation tasks.
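The abstract names two ingredients, an action-conditioned forward step over a keypoint graph and a contrastive objective, without giving architectural detail. The following is a minimal toy sketch of how such pieces might fit together, not the authors' implementation. Every name here (`predict_next`, `info_nce`, `w_rel`, `w_act`) is hypothetical, the fixed weights stand in for learned networks, the push is assumed to act only on keypoint 0, and an InfoNCE-style loss is used as one common form of contrastive estimation.

```python
import math


def predict_next(keypoints, action, w_rel=0.1, w_act=1.0):
    """Toy action-conditioned forward step over a fully connected keypoint graph.

    keypoints: list of (x, y) tuples; action: (dx, dy) push.
    w_rel and w_act are hypothetical constants standing in for learned networks.
    """
    nxt = []
    for i, (xi, yi) in enumerate(keypoints):
        # Aggregate pairwise relation effects from every other keypoint.
        ex = sum(w_rel * (xj - xi) for j, (xj, _) in enumerate(keypoints) if j != i)
        ey = sum(w_rel * (yj - yi) for j, (_, yj) in enumerate(keypoints) if j != i)
        # Assumption: the action acts directly only on keypoint 0.
        ax, ay = (w_act * action[0], w_act * action[1]) if i == 0 else (0.0, 0.0)
        nxt.append((xi + ex + ax, yi + ey + ay))
    return nxt


def sq_dist(a, b):
    """Sum of squared coordinate differences between two keypoint sets."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2 for (ax, ay), (bx, by) in zip(a, b))


def info_nce(pred, positive, negatives, temperature=1.0):
    """InfoNCE-style loss: the prediction should score the true next state
    higher (smaller distance) than each negative sample."""
    scores = [-sq_dist(pred, positive) / temperature]
    scores += [-sq_dist(pred, n) / temperature for n in negatives]
    m = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]


# Usage: the same step applies unchanged to two or three keypoints.
two = predict_next([(0.0, 0.0), (1.0, 0.0)], action=(0.5, 0.0))
three = predict_next([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], action=(0.5, 0.0))
loss = info_nce(two, two, negatives=[[(5.0, 5.0), (6.0, 5.0)]])
```

Because the step is defined per keypoint and per pair, it does not depend on how many keypoints are present, which is the property the abstract credits for generalization to a different number of objects.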

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:2202.09006 [cs.CV]
	(or arXiv:2202.09006v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2202.09006

Submission history

From: Alireza Rezazadeh
[v1] Fri, 18 Feb 2022 03:32:08 UTC (5,579 KB)
[v2] Mon, 19 Dec 2022 22:29:03 UTC (32,231 KB)
[v3] Sat, 5 Aug 2023 21:39:42 UTC (8,623 KB)
