Distilling 3D distinctive local descriptors for 6D pose estimation

Hamza, Amir; Caraffa, Andrea; Boscaini, Davide; Poiesi, Fabio


arXiv:2503.15106v1 (cs)

[Submitted on 19 Mar 2025 (this version), latest version 20 Mar 2025 (v2)]


Abstract: Three-dimensional local descriptors are crucial for encoding geometric surface properties, making them essential for various point cloud understanding tasks. Among these descriptors, GeDi has demonstrated strong zero-shot 6D pose estimation capabilities but remains computationally impractical for real-world applications due to its expensive inference process. Can we retain GeDi's effectiveness while significantly improving its efficiency? In this paper, we explore this question by introducing a knowledge distillation framework that trains an efficient student model to regress local descriptors from a GeDi teacher. Our key contributions include: an efficient large-scale training procedure that ensures robustness to occlusions and partial observations while operating under compute and storage constraints, and a novel loss formulation that handles weak supervision from non-distinctive teacher descriptors. We validate our approach on five BOP Benchmark datasets and demonstrate a significant reduction in inference time while maintaining competitive performance with existing methods, bringing zero-shot 6D pose estimation closer to real-time feasibility. Project Website: this https URL
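As a rough illustration of what "regressing local descriptors from a teacher" can mean in general, the sketch below computes a plain cosine-distance distillation loss between paired student and teacher descriptors. It is not the loss proposed in the paper, whose formulation is not given in the abstract; the function names `l2_normalize` and `distillation_loss` and the toy descriptors are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def distillation_loss(student_desc, teacher_desc):
    """Mean (1 - cosine similarity) over paired student/teacher descriptors.

    Hypothetical helper: a generic regression-style distillation objective,
    not the loss formulation introduced by the paper.
    """
    if not student_desc or len(student_desc) != len(teacher_desc):
        raise ValueError("need the same non-zero number of descriptors")
    total = 0.0
    for s, t in zip(student_desc, teacher_desc):
        s_hat, t_hat = l2_normalize(s), l2_normalize(t)
        cosine = sum(a * b for a, b in zip(s_hat, t_hat))
        total += 1.0 - cosine
    return total / len(student_desc)


# Toy example: two points with 2-D descriptors (real descriptors are higher-dimensional).
teacher = [[1.0, 0.0], [0.0, 2.0]]
student = [[0.0, 1.0], [0.0, 2.0]]
loss = distillation_loss(student, teacher)
```

In this toy case the first student descriptor is orthogonal to its teacher target and the second matches it exactly, so the loss averages the two per-point distances. A training loop would minimise such a quantity with respect to the student's parameters.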

Comments:	Project Website: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.15106 [cs.CV]
	(or arXiv:2503.15106v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.15106

Submission history

From: Amir Hamza
[v1] Wed, 19 Mar 2025 11:04:37 UTC (11,121 KB)
[v2] Thu, 20 Mar 2025 08:27:13 UTC (11,121 KB)
