Context-Enhanced Stereo Transformer

Guo, Weiyu; Li, Zhaoshuo; Yang, Yongkui; Wang, Zheng; Taylor, Russell H.; Unberath, Mathias; Yuille, Alan; Li, Yingwei

arXiv:2210.11719 (cs)

[Submitted on 21 Oct 2022]

Abstract: Stereo depth estimation is of great interest for computer vision research. However, existing methods struggle to generalize and to predict reliably in hazardous regions, such as large uniform regions. To overcome these limitations, we propose Context Enhanced Path (CEP). CEP improves generalization and robustness against common failure cases in existing solutions by capturing long-range global information. We construct our stereo depth estimation model, Context Enhanced Stereo Transformer (CSTR), by plugging CEP into the state-of-the-art stereo depth estimation method Stereo Transformer. CSTR is examined on distinct public datasets, such as Scene Flow, Middlebury-2014, KITTI-2015, and MPI-Sintel. We find that CSTR outperforms prior approaches by a large margin. For example, in the zero-shot synthetic-to-real setting, CSTR outperforms the best competing approaches on the Middlebury-2014 dataset by 11%. Our extensive experiments demonstrate that long-range information is critical for the stereo matching task and that CEP successfully captures such information.

Comments:	Accepted by ECCV 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2210.11719 [cs.CV]
	(or arXiv:2210.11719v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2210.11719

Submission history

From: Guo Weiyu
[v1] Fri, 21 Oct 2022 04:10:47 UTC (5,812 KB)
