Instance-aware multi-object self-supervision for monocular depth prediction

Boulahbal, Houssem; Voicila, Adrian; Comport, Andrew

doi:10.1109/LRA.2022.3194951

arXiv:2203.00809 (cs)

[Submitted on 2 Mar 2022 (v1), last revised 9 Aug 2022 (this version, v2)]

Title: Instance-aware multi-object self-supervision for monocular depth prediction

Authors: Houssem Boulahbal, Adrian Voicila, Andrew Comport

Abstract: This paper proposes a self-supervised monocular image-to-depth prediction framework trained with an end-to-end photometric loss that handles not only 6-DOF camera motion but also 6-DOF moving object instances. Self-supervision is performed by warping the images across a video sequence using depth and scene motion, including the motion of object instances. One novelty of the proposed method is the use of a transformer network's multi-head attention to match moving objects across time and to model their interaction and dynamics. This enables accurate and robust pose estimation for each object instance. Most image-to-depth prediction frameworks assume rigid scenes, which largely degrades their performance on dynamic objects. Only a few state-of-the-art (SOTA) papers have accounted for dynamic objects. The proposed method is shown to outperform these methods on standard benchmarks, and the impact of dynamic motion on these benchmarks is exposed. Furthermore, the proposed image-to-depth prediction framework is shown to be competitive with SOTA video-to-depth prediction frameworks.
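
The warping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pinhole intrinsics, the made-up numbers, and the order in which object and camera motion are composed are all assumptions chosen for illustration. It shows how a pixel with known depth is lifted to 3D, moved by a rigid 6-DOF transform (camera only, or object instance plus camera), and projected back to find where it would be sampled in another frame.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    return [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]


def transform(point, rotation, translation):
    """Apply a rigid 6-DOF motion (3x3 rotation, 3-vector translation)."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def project(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point back to pixel coordinates."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy


def warp_pixel(u, v, depth, intrinsics, camera_motion, object_motion=None):
    """Map a pixel to its location in another frame (hypothetical helper).

    Static pixels use only the camera motion. Pixels on a moving instance
    first apply that instance's own rigid motion (assumed composition order).
    """
    point = backproject(u, v, depth, *intrinsics)
    if object_motion is not None:
        point = transform(point, *object_motion)
    point = transform(point, *camera_motion)
    return project(point, *intrinsics)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (100.0, 100.0, 64.0, 48.0)      # fx, fy, cx, cy (made-up values)
camera_motion = (IDENTITY, [0.0, 0.0, 0.0])  # camera stands still
object_motion = (IDENTITY, [0.5, 0.0, 0.0])  # instance shifts 0.5 units along x

static_uv = warp_pixel(64.0, 48.0, 10.0, intrinsics, camera_motion)
moving_uv = warp_pixel(64.0, 48.0, 10.0, intrinsics, camera_motion, object_motion)
print(static_uv, moving_uv)
```

With a stationary camera, the static pixel maps to itself, while the pixel on the moving instance shifts horizontally. A photometric loss would then compare the image intensity at the original pixel with the intensity sampled at the warped location; under a rigid-scene assumption, the second pixel would be compared at the wrong location.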

Comments:	IROS 2022 and RAL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.00809 [cs.CV]
	(or arXiv:2203.00809v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.00809
Related DOI:	https://doi.org/10.1109/LRA.2022.3194951

Submission history

From: Houssem Eddine Boulahbal
[v1] Wed, 2 Mar 2022 00:59:25 UTC (2,736 KB)
[v2] Tue, 9 Aug 2022 06:46:57 UTC (3,274 KB)
