PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM

Chen, Runnan; Wang, Zhaoqing; Wang, Jiepeng; Ma, Yuexin; Gong, Mingming; Wang, Wenping; Liu, Tongliang

arXiv:2501.00352 (cs)

[Submitted on 31 Dec 2024]

Abstract: Understanding geometric, semantic, and instance information in 3D scenes from sequential video data is essential for applications in robotics and augmented reality. However, existing Simultaneous Localization and Mapping (SLAM) methods generally focus on either geometric or semantic reconstruction. In this paper, we introduce PanoSLAM, the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework. Our approach builds upon 3D Gaussian Splatting, modified with several critical components to enable efficient rendering of depth, color, semantic, and instance information from arbitrary viewpoints. To achieve panoptic 3D scene reconstruction from sequential RGB-D videos, we propose an online Spatial-Temporal Lifting (STL) module that transfers 2D panoptic predictions from vision models into 3D Gaussian representations. This STL module addresses the challenges of label noise and inconsistencies in 2D predictions by refining the pseudo labels across multi-view inputs, creating a coherent 3D representation that enhances segmentation accuracy. Our experiments show that PanoSLAM outperforms recent semantic SLAM methods in both mapping and tracking accuracy. For the first time, it achieves panoptic 3D reconstruction of open-world environments directly from RGB-D video.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:2501.00352 [cs.CV]
	(or arXiv:2501.00352v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.00352

Submission history

From: Runnan Chen Dr.
[v1] Tue, 31 Dec 2024 08:58:10 UTC (13,368 KB)
