NeRF-DetS: Enhanced Adaptive Spatial-wise Sampling and View-wise Fusion Strategies for NeRF-based Indoor Multi-view 3D Object Detection

Huang, Chi; Li, Xinyang; Qu, Yansong; Wu, Changli; Li, Xiaofan; Zhang, Shengchuan; Cao, Liujuan

arXiv:2404.13921 (cs)

[Submitted on 22 Apr 2024 (v1), last revised 30 Dec 2024 (this version, v2)]

Abstract: In indoor scenes, the diverse distribution of object locations and scales makes visual 3D perception challenging.
Previous works (e.g., NeRF-Det) have demonstrated that implicit representations can benefit visual 3D perception in indoor scenes where the input images overlap heavily.
However, these works cannot fully exploit implicit representations because they rely on fixed sampling and simple multi-view feature fusion.
In this paper, inspired by sparse methods (e.g., DETR3D), we propose a simple yet effective method, NeRF-DetS, to address these issues. NeRF-DetS comprises two modules: the Progressive Adaptive Sampling Strategy (PASS) and Depth-Guided Simplified Multi-Head Attention Fusion (DS-MHA).
Specifically,
(1) PASS automatically samples the features of each layer within a dense 3D detector, using offsets predicted by the previous layer.
(2) DS-MHA efficiently fuses multi-view features with strong occlusion awareness while also reducing computational cost.
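
The sampling idea in (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that each layer samples features using offsets predicted by the previous layer. The function names, the clamping by `max_shift`, and the toy feature and offset functions are all assumptions made for illustration. A real detector would sample image features at projected 3D positions and use learned offset heads.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Feature = Tuple[float, ...]


def make_grid(n: int) -> List[Point]:
    """Fixed, evenly spaced sample positions in the unit cube (non-adaptive baseline)."""
    coords = [(i + 0.5) / n for i in range(n)]
    return [(x, y, z) for x in coords for y in coords for z in coords]


def progressive_sample(
    points: Sequence[Point],
    sample_fn: Callable[[Point], Feature],
    offset_heads: Sequence[Callable[[Feature], Point]],
    max_shift: float,
) -> List[List[Point]]:
    """Return the sample positions used at every layer.

    Layer 0 samples on the fixed grid. Each later layer samples at the previous
    positions plus the clamped offsets predicted from the previous layer's features.
    """
    current = [tuple(p) for p in points]
    positions = [current]
    for head in offset_heads:
        feats = [sample_fn(p) for p in current]  # sample at the current positions
        offsets = [head(f) for f in feats]       # one predicted offset per point
        current = [
            tuple(c + max(-max_shift, min(max_shift, d)) for c, d in zip(p, o))
            for p, o in zip(current, offsets)
        ]
        positions.append(current)
    return positions


# Toy stand-ins (hypothetical): the "feature" at a point is the vector towards an
# object centre, and each "head" proposes moving half of the way along it.
CENTRE = (0.7, 0.2, 0.5)


def toy_feature(p: Point) -> Feature:
    return tuple(c - x for x, c in zip(p, CENTRE))


def toy_head(f: Feature) -> Point:
    return tuple(0.5 * v for v in f)


def mean_dist(pts: Sequence[Point]) -> float:
    return sum(math.dist(p, CENTRE) for p in pts) / len(pts)


stages = progressive_sample(make_grid(4), toy_feature, [toy_head, toy_head], max_shift=0.1)
```

In this toy setting the sample positions drift towards the object centre from one stage to the next, so `mean_dist(stages[k])` shrinks as `k` grows, whereas a fixed grid would keep sampling the same locations at every layer.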
Extensive experiments on the ScanNetV2 dataset demonstrate that NeRF-DetS outperforms NeRF-Det, improving mAP by +5.02% and +5.92% under IoU25 and IoU50, respectively. NeRF-DetS also shows consistent improvements on ARKitScenes.
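
The occlusion-aware fusion attributed to DS-MHA can likewise be sketched under explicit assumptions. The abstract does not give the formulation, so everything below is one plausible reading and not the paper's method. The sketch assumes that each view supplies a feature vector for a 3D point, the point's depth in that view, and an estimated surface depth along the same ray. Views in which the point lies behind the surface are down-weighted, and each head uses a plain softmax over views with no query/key projections. The names `depth_guided_fusion`, `sharpness`, and the mean-activation score are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def depth_guided_fusion(
    view_feats: Sequence[Sequence[float]],
    point_depths: Sequence[float],
    surface_depths: Sequence[float],
    num_heads: int = 2,
    sharpness: float = 10.0,
) -> List[float]:
    """Fuse per-view features of one 3D point into a single feature vector."""
    channels = len(view_feats[0])
    assert channels % num_heads == 0
    head_dim = channels // num_heads

    # Depth guidance (assumed form): penalise a view by how far the point lies
    # behind the surface that view observes, i.e. by how occluded the point is.
    depth_logits = [
        -sharpness * max(0.0, d - s) for d, s in zip(point_depths, surface_depths)
    ]

    fused: List[float] = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        # Simplified attention (placeholder score): mean activation of this head's
        # channels plus the depth term, with no learned query/key projections.
        logits = [sum(f[lo:hi]) / head_dim + dl for f, dl in zip(view_feats, depth_logits)]
        weights = softmax(logits)
        for c in range(lo, hi):
            fused.append(sum(w * f[c] for w, f in zip(weights, view_feats)))
    return fused


feats = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
# Both views see the point: the result matches a plain average.
visible = depth_guided_fusion(feats, point_depths=[2.0, 2.0], surface_depths=[2.0, 2.0])
# The second view is blocked by a surface at depth 1.0: its weight collapses.
occluded = depth_guided_fusion(feats, point_depths=[2.0, 3.0], surface_depths=[2.0, 1.0])
```

When every view sees the point, the fusion reduces to simple averaging, which is the kind of baseline fusion the abstract contrasts against. When a view is occluded, its contribution is suppressed, so the fused feature follows the views that actually observe the point.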

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.13921 [cs.CV]
	(or arXiv:2404.13921v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.13921

Submission history

From: Chi Huang
[v1] Mon, 22 Apr 2024 06:59:03 UTC (8,558 KB)
[v2] Mon, 30 Dec 2024 13:26:37 UTC (13,991 KB)
