Revisit Self-supervised Depth Estimation with Local Structure-from-Motion

Zhu, Shengjie; Liu, Xiaoming

arXiv:2407.19166 (cs)

[Submitted on 27 Jul 2024 (v1), last revised 6 Aug 2024 (this version, v2)]

Abstract: Both self-supervised depth estimation and Structure-from-Motion (SfM) recover scene depth from RGB videos. Despite sharing a similar objective, the two approaches are disconnected. Prior self-supervised works backpropagate losses defined within immediately neighboring frames. Instead of learning-through-loss, this work proposes an alternative scheme that performs local SfM. First, with calibrated RGB or RGB-D images, we employ a depth and correspondence estimator to infer depthmaps and pair-wise correspondence maps. Then, a novel bundle-RANSAC-adjustment algorithm jointly optimizes camera poses and one depth adjustment for each depthmap. Finally, we fix the camera poses and employ a NeRF, without a neural network, for dense triangulation and geometric verification. Our outputs are poses, depth adjustments, and triangulated sparse depths. For the first time, we show that self-supervision within 5 frames already benefits SoTA supervised depth and correspondence models. The project page is available at this https URL.
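To give a rough feel for the "one depth adjustment for each depthmap" idea, the following is a minimal sketch, not the paper's bundle-RANSAC-adjustment algorithm. It assumes the adjustment is a single scalar scale per depthmap and that some sparse reference depths (for example, from triangulation) are available for a subset of pixels. The function name `ransac_scale` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def ransac_scale(pred, ref, iters=200, thresh=0.05, seed=0):
    """Robustly estimate a scalar s such that s * pred is close to ref.

    pred, ref: equal-length sequences of predicted and reference depths.
    thresh: relative-error bound for counting a pair as an inlier.
    Assumption: one scalar scale per depthmap (illustrative only).
    """
    rng = random.Random(seed)
    # Keep only pairs where both depths are valid (positive).
    pairs = [(p, r) for p, r in zip(pred, ref) if p > 0 and r > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")

    best_inliers = []
    for _ in range(iters):
        # Minimal sample: a single pair fixes a candidate scale.
        p, r = rng.choice(pairs)
        s = r / p
        inliers = [(pi, ri) for pi, ri in pairs
                   if abs(s * pi - ri) / ri < thresh]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers

    # Refine with a least-squares scale over the best inlier set:
    # s = sum(p * r) / sum(p * p).
    num = sum(p * r for p, r in best_inliers)
    den = sum(p * p for p, _ in best_inliers)
    return num / den, len(best_inliers)


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    reference = [2.0, 4.0, 6.0, 8.0, 10.0, 100.0]  # last pair is an outlier
    scale, n_inliers = ransac_scale(predicted, reference)
    print(scale, n_inliers)
```

The single-pair minimal sample keeps the sketch short; a joint optimization over camera poses, as the abstract describes, would need a far richer model than this one-parameter fit.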

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.19166 [cs.CV]
	(or arXiv:2407.19166v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.19166

Submission history

From: Shengjie Zhu
[v1] Sat, 27 Jul 2024 04:37:16 UTC (19,002 KB)
[v2] Tue, 6 Aug 2024 18:52:04 UTC (19,002 KB)
