YOLOv3 with Spatial Pyramid Pooling for Object Detection with Unmanned Aerial Vehicles

Pebrianto, Wahyu; Mudjirahardjo, Panca; Pramono, Sholeh Hadi; Rahmadwati; Setyawan, Raden Arief

arXiv:2305.12344 (cs)

[Submitted on 21 May 2023]

Abstract: Object detection with Unmanned Aerial Vehicles (UAVs) has attracted much attention in computer vision research. However, it is not easy to accurately detect objects in data obtained from UAVs, which capture images from very high altitudes, so the images are dominated by small objects that are difficult to detect. Motivated by that challenge, we aim to improve the performance of the one-stage detector YOLOv3 by adding a Spatial Pyramid Pooling (SPP) layer at the end of the Darknet-53 backbone to obtain a more efficient feature extraction process in object detection tasks with UAVs. We also conducted an evaluation study of different versions of YOLOv3, including YOLOv3 with SPP, YOLOv3, and YOLOv3-tiny, which we analyzed on the VisDrone2019-Det dataset. Here we show that YOLOv3 with SPP achieves an mAP 0.6% higher than YOLOv3 and 26.6% higher than YOLOv3-tiny at a 640x640 input scale, and that it maintains accuracy across different input image scales better than the other versions of YOLOv3. These results show that adding an SPP layer to YOLOv3 can be an efficient solution for improving object detection performance on data obtained from UAVs.
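
To make the SPP idea in the abstract concrete, the following is a minimal pure-Python sketch of a spatial pyramid pooling block: the feature map is max-pooled at several window sizes with stride 1, and the results are concatenated with the input along the channel axis. The kernel sizes (5, 9, 13), the function names `max_pool_same` and `spp_block`, and the toy input are assumptions for illustration; the abstract does not state the authors' exact configuration, and the convolution layers that normally surround the block are omitted.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def max_pool_same(channel: List[List[float]], k: int) -> List[List[float]]:
    """Stride-1 max pooling with a k x k window; output keeps the input size.

    Positions outside the map are skipped, which is equivalent to padding
    the map with negative infinity.
    """
    h, w = len(channel), len(channel[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                channel[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(features: FeatureMap, kernel_sizes=(5, 9, 13)) -> FeatureMap:
    """Concatenate the input with its pooled copies along the channel axis.

    The kernel sizes are an assumed, commonly used choice, not a value
    taken from the abstract.
    """
    out = [[row[:] for row in ch] for ch in features]  # copy of the input
    for k in kernel_sizes:
        out.extend(max_pool_same(ch, k) for ch in features)
    return out


# Toy input: 2 channels on a 4x4 grid (hypothetical values).
features = [
    [[float(4 * i + j) for j in range(4)] for i in range(4)],
    [[float(-(4 * i + j)) for j in range(4)] for i in range(4)],
]
pooled = spp_block(features)
print(len(pooled), len(pooled[0]), len(pooled[0][0]))  # prints: 8 4 4
```

With three pooling scales the channel count grows from C to 4C while the spatial size is unchanged, so each location carries context from progressively larger neighborhoods.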

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2305.12344 [cs.CV]
	(or arXiv:2305.12344v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2305.12344

Submission history

From: Wahyu Pebrianto
[v1] Sun, 21 May 2023 04:41:52 UTC (1,283 KB)
