What You See is What You Get: Exploiting Visibility for 3D Object Detection

Hu, Peiyun; Ziglar, Jason; Held, David; Ramanan, Deva

arXiv:1912.04986 (cs)

[Submitted on 10 Dec 2019 (v1), last revised 21 Dec 2020 (this version, v3)]

Abstract: Recent advances in 3D sensing have created unique challenges for computer vision. One fundamental challenge is finding a good representation for 3D sensor data. Most popular representations (such as PointNet) are proposed in the context of processing truly 3D data (e.g., points sampled from mesh models), ignoring the fact that 3D sensor data such as a LiDAR sweep is in fact 2.5D. We argue that representing 2.5D data as collections of (x, y, z) points fundamentally destroys hidden information about freespace. In this paper, we demonstrate that such knowledge can be efficiently recovered through 3D raycasting and readily incorporated into batch-based gradient learning. We describe a simple approach to augmenting voxel-based networks with visibility: we add a voxelized visibility map as an additional input stream. In addition, we show that visibility can be combined with two crucial modifications common to state-of-the-art 3D detectors: synthetic data augmentation of virtual objects and temporal aggregation of LiDAR sweeps over multiple time frames. On the NuScenes 3D detection benchmark, we show that adding a stream for visibility input significantly improves the overall detection accuracy of a state-of-the-art 3D detector.
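
The idea of recovering freespace by raycasting can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-state encoding (`UNKNOWN`, `FREE`, `OCCUPIED`), and the `step_fraction` parameter are all assumptions made for illustration, and the sketch samples each ray at fixed intervals rather than performing an exact voxel traversal, so it can miss voxels that a ray only clips at a corner.

```python
import math

# Voxel states (a hypothetical encoding chosen for this sketch).
UNKNOWN, FREE, OCCUPIED = 0, 1, 2


def voxel_index(point, grid_min, voxel_size):
    """Map a 3D point to integer voxel coordinates."""
    return tuple(int(math.floor((p - g) / voxel_size))
                 for p, g in zip(point, grid_min))


def in_bounds(index, grid_shape):
    """Check that a voxel index lies inside the grid."""
    return all(0 <= i < n for i, n in zip(index, grid_shape))


def compute_visibility(points, origin, grid_min, grid_shape, voxel_size,
                       step_fraction=0.5):
    """Approximate a voxelized visibility map for one LiDAR sweep.

    Returns a dict mapping voxel index -> FREE or OCCUPIED.
    Voxels absent from the dict are UNKNOWN (never reached by a ray).
    """
    visibility = {}
    step = voxel_size * step_fraction  # sampling interval along each ray
    for point in points:
        delta = [p - o for p, o in zip(point, origin)]
        distance = math.sqrt(sum(d * d for d in delta))
        if distance > 0.0:
            direction = [d / distance for d in delta]
            # Walk from the sensor toward the return, marking freespace.
            for i in range(int(distance / step)):
                t = i * step
                sample = [o + t * d for o, d in zip(origin, direction)]
                idx = voxel_index(sample, grid_min, voxel_size)
                # Never downgrade a voxel already known to be occupied.
                if in_bounds(idx, grid_shape) and visibility.get(idx) != OCCUPIED:
                    visibility[idx] = FREE
        # The voxel containing the LiDAR return itself is occupied.
        end_idx = voxel_index(point, grid_min, voxel_size)
        if in_bounds(end_idx, grid_shape):
            visibility[end_idx] = OCCUPIED
    return visibility
```

Calling `compute_visibility` with a single return along the x-axis marks the voxels between the sensor and the return as `FREE`, the voxel holding the return as `OCCUPIED`, and leaves everything behind it out of the dict, that is, `UNKNOWN`. A map like this is the kind of extra input that a plain (x, y, z) point list does not carry.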

Comments:	CVPR'20. More at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:1912.04986 [cs.CV]
	(or arXiv:1912.04986v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1912.04986

Submission history

From: Peiyun Hu
[v1] Tue, 10 Dec 2019 21:15:37 UTC (7,569 KB)
[v2] Tue, 31 Mar 2020 00:24:14 UTC (7,862 KB)
[v3] Mon, 21 Dec 2020 22:42:25 UTC (7,862 KB)
