OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation

Cai, Dingding; Heikkilä, Janne; Rahtu, Esa


arXiv:2203.01072 (cs)

[Submitted on 2 Mar 2022 (v1), last revised 7 Apr 2022 (this version, v3)]


Abstract: This paper proposes a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask. Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most existing methods, it generalizes well to new real-world objects without any fine-tuning. We achieve this by decomposing the 6D pose into viewpoint, in-plane rotation around the camera optical axis, and translation, and by introducing novel lightweight modules for estimating each component in a cascaded manner. The resulting network contains fewer than 4M parameters while demonstrating excellent performance on the challenging T-LESS and Occluded LINEMOD datasets without any dataset-specific training. We show that OVE6D outperforms some contemporary deep learning-based pose estimation methods specifically trained for individual objects or datasets with real-world training data.
The implementation and the pre-trained model will be made publicly available.
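
The pose decomposition named in the abstract can be illustrated geometrically. The sketch below is not the authors' implementation or network; it only shows one way a rigid pose can be split into a viewpoint, an in-plane rotation about the camera optical axis, and a translation, and then recomposed. It assumes that R maps object coordinates to camera coordinates and that the optical axis is the camera +z axis. The function names and the fixed "up" convention used to define a canonical viewpoint rotation are hypothetical choices made for this example.

```python
import numpy as np

def rot_z(theta):
    """Rotation by theta (radians) about the z axis (assumed optical axis)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def canonical_viewpoint_rotation(view_dir):
    """Rotation whose third row is view_dir, using a fixed 'up' convention.

    Hypothetical helper: any deterministic convention would serve here.
    """
    z = view_dir / np.linalg.norm(view_dir)
    up = np.array([0.0, 0.0, 1.0])
    if abs(z @ up) > 0.999:          # avoid a degenerate cross product
        up = np.array([0.0, 1.0, 0.0])
    x = np.cross(up, z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)               # completes a right-handed frame
    return np.stack([x, y, z])       # rows are the camera axes in object coordinates

def decompose_pose(R, t):
    """Split (R, t) into (viewing direction, in-plane angle, translation)."""
    view_dir = R[2]                  # optical axis expressed in object coordinates
    R_view = canonical_viewpoint_rotation(view_dir)
    R_inplane = R @ R_view.T         # what remains is a rotation about z
    theta = np.arctan2(R_inplane[1, 0], R_inplane[0, 0])
    return view_dir, theta, t

def compose_pose(view_dir, theta, t):
    """Inverse of decompose_pose: R = R_z(theta) @ R_view."""
    return rot_z(theta) @ canonical_viewpoint_rotation(view_dir), t
```

Left-multiplying R by a rotation about z leaves its third row unchanged, which is why the viewing direction and the in-plane angle can be treated as separate components under these assumptions.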

Comments:	CVPR 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.01072 [cs.CV]
	(or arXiv:2203.01072v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.01072

Submission history

From: Dingding Cai
[v1] Wed, 2 Mar 2022 12:51:33 UTC (17,014 KB)
[v2] Sun, 6 Mar 2022 13:38:13 UTC (15,782 KB)
[v3] Thu, 7 Apr 2022 18:35:18 UTC (16,047 KB)
