Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction

Shen, Qiuhong; Wu, Zike; Yi, Xuanyu; Zhou, Pan; Zhang, Hanwang; Yan, Shuicheng; Wang, Xinchao

arXiv:2403.18795v3 (cs)

[Submitted on 27 Mar 2024 (v1), last revised 24 May 2024 (this version, v3)]

Abstract: We tackle the challenge of efficiently reconstructing a 3D asset from a single image at millisecond speed. Existing methods for single-image 3D reconstruction are primarily based on Score Distillation Sampling (SDS) with neural 3D representations. Despite promising results, these approaches encounter practical limitations due to lengthy optimizations and significant memory consumption. In this work, we introduce Gamba, an end-to-end 3D reconstruction model from a single-view image, emphasizing two main insights: (1) Efficient Backbone Design: introducing a Mamba-based GambaFormer network to model 3D Gaussian Splatting (3DGS) reconstruction as sequential prediction with linear scalability of token length, thereby accommodating a substantial number of Gaussians; (2) Robust Gaussian Constraints: deriving radial mask constraints from multi-view masks to eliminate the need for warmup supervision of 3D point clouds in training. We trained Gamba on Objaverse and assessed it against existing optimization-based and feed-forward 3D reconstruction approaches on the GSO Dataset, among which Gamba is the only end-to-end trained single-view reconstruction model with 3DGS. Experimental results demonstrate its competitive generation capabilities both qualitatively and quantitatively and highlight its remarkable speed: Gamba completes reconstruction within 0.05 seconds on a single NVIDIA A100 GPU, which is about $1,000\times$ faster than optimization-based methods. Please see our project page at this https URL.

Comments:	project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.18795 [cs.CV]
	(or arXiv:2403.18795v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.18795

Submission history

From: Qiuhong Shen
[v1] Wed, 27 Mar 2024 17:40:14 UTC (4,538 KB)
[v2] Fri, 29 Mar 2024 08:02:14 UTC (4,538 KB)
[v3] Fri, 24 May 2024 18:43:28 UTC (10,236 KB)
