Continuous 3D Perception Model with Persistent State

Wang, Qianqian; Zhang, Yifei; Holynski, Aleksander; Efros, Alexei A.; Kanazawa, Angjoo

Computer Science > Computer Vision and Pattern Recognition

arXiv:2501.12387 (cs)

[Submitted on 21 Jan 2025]

Abstract: We present a unified framework capable of solving a broad range of 3D tasks. Our approach features a stateful recurrent model that continuously updates its state representation with each new observation. Given a stream of images, this evolving state can be used to generate metric-scale pointmaps (per-pixel 3D points) for each new input in an online fashion. These pointmaps reside within a common coordinate system, and can be accumulated into a coherent, dense scene reconstruction that updates as new images arrive. Our model, called CUT3R (Continuous Updating Transformer for 3D Reconstruction), captures rich priors of real-world scenes: not only can it predict accurate pointmaps from image observations, but it can also infer unseen regions of the scene by probing at virtual, unobserved views. Our method is simple yet highly flexible, naturally accepting varying lengths of images that may be either video streams or unordered photo collections, containing both static and dynamic content. We evaluate our method on various 3D/4D tasks and demonstrate competitive or state-of-the-art performance in each. Project Page: this https URL
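
The online loop described in the abstract (update a persistent state with each observation, emit a pointmap for that observation, accumulate pointmaps in a shared coordinate system) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `ToyStatefulModel`, `step`, and `reconstruct_online` are hypothetical names, and the "prediction" is a hand-written placeholder rather than a learned transformer. It is not the CUT3R architecture or its API; it only shows the control flow of a stateful, per-frame reconstruction.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]
Image = List[List[float]]      # H x W grid of toy "pixel" values
Pointmap = List[List[Point]]   # one 3D point per pixel


@dataclass
class ToyStatefulModel:
    """Hypothetical stand-in for a stateful recurrent model (not CUT3R itself)."""
    frames_seen: int = 0        # persistent state: observations processed so far
    running_mean: float = 0.0   # persistent state: running mean of pixel values

    def step(self, image: Image) -> Pointmap:
        """Update the state with one observation, then emit a pointmap for it."""
        pixels = [value for row in image for value in row]
        self.frames_seen += 1
        frame_mean = sum(pixels) / len(pixels)
        # Incremental mean: the state summarizes every input seen so far.
        self.running_mean += (frame_mean - self.running_mean) / self.frames_seen
        # Placeholder prediction (assumption): pixel column/row become x/y and the
        # pixel value becomes z. Each frame is shifted along x using the state,
        # so all pointmaps live in one shared coordinate system.
        x_offset = float(self.frames_seen - 1) * len(image[0])
        return [[(x_offset + u, float(v), z) for u, z in enumerate(row)]
                for v, row in enumerate(image)]


def reconstruct_online(images: List[Image]) -> Tuple[List[Point], ToyStatefulModel]:
    """Process images one at a time, growing the scene as each frame arrives."""
    model = ToyStatefulModel()
    scene: List[Point] = []
    for image in images:
        pointmap = model.step(image)
        scene.extend(point for row in pointmap for point in row)
    return scene, model


if __name__ == "__main__":
    frames = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
    scene, model = reconstruct_online(frames)
    print(len(scene), model.frames_seen)
```

The point of the sketch is the structure: the state persists across calls to `step`, and the accumulated scene is usable after any number of frames without reprocessing earlier inputs.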

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.12387 [cs.CV]
	(or arXiv:2501.12387v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.12387

Submission history

From: Qianqian Wang
[v1] Tue, 21 Jan 2025 18:59:23 UTC (14,156 KB)
