FocalPose++: Focal Length and Object Pose Estimation via Render and Compare

Cífka, Martin; Ponimatkin, Georgy; Labbé, Yann; Russell, Bryan; Aubry, Mathieu; Petrik, Vladimir; Sivic, Josef

doi:10.1109/TPAMI.2024.3475638


arXiv:2312.02985 (cs)

[Submitted on 15 Nov 2023 (v1), last revised 6 Nov 2024 (this version, v2)]



Abstract: We introduce FocalPose++, a neural render-and-compare method for jointly estimating the camera-object 6D pose and the camera focal length from a single RGB input image depicting a known object. The contributions of this work are threefold. First, we derive a focal length update rule that extends an existing state-of-the-art render-and-compare 6D pose estimator to address the joint estimation task. Second, we investigate several loss functions for jointly estimating the object pose and focal length. We find that combining direct focal length regression with a reprojection loss that disentangles the contributions of translation, rotation, and focal length leads to improved results. Third, we explore the effect of different synthetic training data on the performance of our method. Specifically, we investigate different distributions for sampling the object's 6D pose and the camera's focal length when rendering the synthetic images, and show that a parametric distribution fitted to real training data works best. We show results on three challenging benchmark datasets that depict known 3D models in uncontrolled settings. We demonstrate that our focal length and 6D pose estimates have lower error than those of existing state-of-the-art methods.

Comments:	25 pages, 22 figures. IEEE TPAMI, 2024. Extended version of the conference paper arXiv:2204.05145
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.02985 [cs.CV]
	(or arXiv:2312.02985v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.02985
Related DOI:	https://doi.org/10.1109/TPAMI.2024.3475638

Submission history

From: Martin Cífka
[v1] Wed, 15 Nov 2023 13:28:02 UTC (14,636 KB)
[v2] Wed, 6 Nov 2024 23:02:02 UTC (43,311 KB)
