Real-Time Hybrid Mapping of Populated Indoor Scenes using a Low-Cost Monocular UAV

Golodetz, Stuart; Vankadari, Madhu; Everitt, Aluna; Shin, Sangyun; Markham, Andrew; Trigoni, Niki

Abstract: Unmanned aerial vehicles (UAVs) have been used for many applications in recent years, from urban search and rescue, to agricultural surveying, to autonomous underground mine exploration. However, deploying UAVs in tight, indoor spaces, especially close to humans, remains a challenge. One solution, when only a limited payload is required, is to use micro-UAVs, which pose less risk to humans and typically cost less to replace after a crash. However, micro-UAVs can only carry a limited sensor suite, e.g. a monocular camera instead of a stereo pair or LiDAR, which complicates tasks like dense mapping and markerless multi-person 3D human pose estimation that are needed to operate in tight environments around people. Monocular approaches to such tasks exist, and dense monocular mapping approaches have been successfully deployed for UAV applications. However, despite many recent works on both marker-based and markerless multi-UAV single-person motion capture, markerless single-camera multi-person 3D human pose estimation remains a much earlier-stage technology, and we are not aware of existing attempts to deploy it in an aerial context. In this paper, we present what is thus, to our knowledge, the first system to perform simultaneous mapping and multi-person 3D human pose estimation from a monocular camera mounted on a single UAV. In particular, we show how to loosely couple state-of-the-art monocular depth estimation and monocular 3D human pose estimation approaches to reconstruct a hybrid map of a populated indoor scene in real time. We validate our component-level design choices via extensive experiments on the large-scale ScanNet and GTA-IM datasets. To evaluate our system-level performance, we also construct a new Oxford Hybrid Mapping dataset of populated indoor scenes.

Comments:	Submitted to IROS 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
MSC classes:	68T45
ACM classes:	I.2.10; I.2.9
Cite as:	arXiv:2203.02453 [cs.CV]
	(or arXiv:2203.02453v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.02453
