A Novel Convolution and Attention Mechanism-based Model for 6D Object Pose Estimation

Du, Alexander; Zhu, Yingwu

Computer Science > Computer Vision and Pattern Recognition

arXiv:2501.01993 (cs)

[Submitted on 31 Dec 2024]

Title: A Novel Convolution and Attention Mechanism-based Model for 6D Object Pose Estimation

Authors: Alexander Du, Yingwu Zhu

Abstract: Estimating 6D object poses from RGB images is challenging because the lack of depth information requires inferring three-dimensional structure from 2D projections. Traditional methods often rely on deep learning with grid-based data structures but struggle to capture complex dependencies among extracted features. To overcome this, we introduce a graph-based representation derived directly from images, where the spatial-temporal features of each pixel serve as nodes, and relationships between them are defined through node connectivity and spatial interactions. We also employ feature selection mechanisms that use spatial attention and self-attention distillation, along with a Legendre convolution layer that leverages the orthogonality of Legendre polynomials for numerical stability. Experiments on the LINEMOD, Occluded LINEMOD, and YCB Video datasets demonstrate that our method outperforms nine existing approaches and achieves state-of-the-art benchmark results in object pose estimation.
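
The abstract attributes the numerical stability of the Legendre convolution layer to the orthogonality of Legendre polynomials. The following is a minimal sketch of that property only, not the authors' layer: it builds the Gram matrix of the Legendre basis on [-1, 1] with NumPy and compares its conditioning against a plain monomial basis. The function name `gram_matrices` and the choice of maximum degree 6 are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L


def gram_matrices(max_degree: int):
    """Return Gram matrices of the Legendre and monomial bases on [-1, 1].

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    # Gauss-Legendre quadrature with max_degree + 1 nodes integrates
    # polynomials up to degree 2 * max_degree + 1 exactly on [-1, 1].
    nodes, weights = L.leggauss(max_degree + 1)

    # Column n of leg_basis holds P_n evaluated at the quadrature nodes.
    leg_basis = L.legvander(nodes, max_degree)
    # Column n of mono_basis holds nodes ** n.
    mono_basis = np.vander(nodes, max_degree + 1, increasing=True)

    # Entry (m, n) approximates the integral of basis_m * basis_n over [-1, 1].
    gram_leg = leg_basis.T @ (weights[:, None] * leg_basis)
    gram_mono = mono_basis.T @ (weights[:, None] * mono_basis)
    return gram_leg, gram_mono


max_degree = 6  # assumed value, chosen only to keep the example small
gram_leg, gram_mono = gram_matrices(max_degree)

# Orthogonality: off-diagonal entries vanish, diagonal equals 2 / (2n + 1).
expected_diag = 2.0 / (2.0 * np.arange(max_degree + 1) + 1.0)
off_diag = gram_leg - np.diag(np.diag(gram_leg))

print("largest off-diagonal entry:", np.max(np.abs(off_diag)))
print("condition number, Legendre basis:", np.linalg.cond(gram_leg))
print("condition number, monomial basis:", np.linalg.cond(gram_mono))
```

Because the Legendre Gram matrix is diagonal, its condition number grows only linearly with the degree, whereas the monomial Gram matrix becomes ill-conditioned quickly. This is one common reading of why an orthogonal polynomial basis helps numerical stability; how the paper's layer uses the basis is not specified in the abstract.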

Comments:	8 pages, 4 figures, 6 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2501.01993 [cs.CV]
	(or arXiv:2501.01993v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.01993

Submission history

From: Alexander Du
[v1] Tue, 31 Dec 2024 18:47:54 UTC (359 KB)
