CASAPose: Class-Adaptive and Semantic-Aware Multi-Object Pose Estimation

Gard, Niklas; Hilsmann, Anna; Eisert, Peter

arXiv:2210.05318 (cs)

[Submitted on 11 Oct 2022 (v1), last revised 9 Dec 2022 (this version, v3)]

Abstract: Applications in the field of augmented reality or robotics often require joint localisation and 6D pose estimation of multiple objects. However, most algorithms need one network per object class to be trained in order to provide the best results. Analysing all visible objects demands multiple inferences, which is costly in both memory and time. We present a new single-stage architecture called CASAPose that determines 2D-3D correspondences for pose estimation of multiple different objects in RGB images in one pass. It is fast and memory efficient, and achieves high accuracy for multiple objects by exploiting the output of a semantic segmentation decoder as control input to a keypoint recognition decoder via local class-adaptive normalisation. Our new differentiable regression of keypoint locations significantly contributes to a faster closing of the domain gap between real test and synthetic training data. We apply segmentation-aware convolutions and upsampling operations to increase the focus inside the object mask and to reduce mutual interference of occluding objects. For each inserted object, the network grows by only one output segmentation map and a negligible number of parameters. We outperform state-of-the-art approaches in challenging multi-object scenes with inter-object occlusion and synthetic training.
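
The idea of local class-adaptive normalisation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `class_adaptive_norm`, the data layout, and the use of parameter-free per-channel normalisation before modulation are all assumptions made for illustration. The sketch only shows the general principle that each pixel's scale and shift are looked up from its predicted class label.

```python
import math

def class_adaptive_norm(features, labels, gamma, beta, eps=1e-5):
    """Hypothetical sketch of class-adaptive normalisation.

    features: [C][H][W] floats (feature maps of a decoder layer)
    labels:   [H][W] int class ids (e.g. argmax of a segmentation output)
    gamma, beta: [num_classes][C] floats (assumed learned per-class parameters)
    """
    out = []
    for c, channel in enumerate(features):
        # Normalise each channel over all spatial positions (assumed choice).
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        # Modulate every pixel with the scale/shift of its own class.
        out.append([
            [gamma[labels[y][x]][c] * (v - mean) * inv_std + beta[labels[y][x]][c]
             for x, v in enumerate(row)]
            for y, row in enumerate(channel)
        ])
    return out

# Toy input: one channel, 2x2 map, background (0) and one object class (1).
features = [[[1.0, 3.0], [1.0, 3.0]]]
labels = [[0, 1], [0, 1]]
gamma = [[1.0], [2.0]]  # one row per class
beta = [[0.0], [0.5]]
result = class_adaptive_norm(features, labels, gamma, beta)
```

In this sketch, supporting an additional object class means adding one row to `gamma` and one to `beta` per normalisation layer, which is one way a network could grow by only a small number of parameters per object.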

Comments:	BMVC 2022, camera-ready version (this submission includes the paper and supplementary material)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2210.05318 [cs.CV]
	(or arXiv:2210.05318v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2210.05318

Submission history

From: Niklas Gard
[v1] Tue, 11 Oct 2022 10:20:01 UTC (6,774 KB)
[v2] Mon, 17 Oct 2022 13:40:46 UTC (6,460 KB)
[v3] Fri, 9 Dec 2022 09:50:43 UTC (6,055 KB)
