GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs

Wang, Xingrui; Lan, Cuiling; Zhu, Hanxin; Chen, Zhibo; Lu, Yan

arXiv:2412.16932 (cs)

[Submitted on 22 Dec 2024]

Abstract: Modeling and understanding the 3D world is crucial for various applications, from augmented reality to robotic navigation. Recent advancements based on 3D Gaussian Splatting have integrated semantic information from multi-view images into Gaussian primitives. However, these methods typically require costly per-scene optimization from dense calibrated images, limiting their practicality. In this paper, we consider the new task of generalizable 3D semantic field modeling from sparse, uncalibrated image pairs. Building upon the Splatt3R architecture, we introduce GSemSplat, a framework that learns open-vocabulary semantic representations linked to 3D Gaussians without the need for per-scene optimization, dense image collections or calibration. To ensure effective and reliable learning of semantic features in 3D space, we employ a dual-feature approach that leverages both region-specific and context-aware semantic features as supervision in the 2D space. This allows us to capitalize on their complementary strengths. Experimental results on the ScanNet++ dataset demonstrate the effectiveness and superiority of our approach compared to the traditional scene-specific method. We hope our work will inspire more research into generalizable 3D understanding.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2412.16932 [cs.CV]
	(or arXiv:2412.16932v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.16932

Submission history

From: Xingrui Wang
[v1] Sun, 22 Dec 2024 09:06:58 UTC (2,210 KB)
