A DeNoising FPN With Transformer R-CNN for Tiny Object Detection

Liu, Hou-I; Tseng, Yu-Wen; Chang, Kai-Cheng; Wang, Pin-Jyun; Shuai, Hong-Han; Cheng, Wen-Huang

arXiv:2406.05755 (cs)

[Submitted on 9 Jun 2024 (v1), last revised 15 Jun 2024 (this version, v3)]

Abstract: Despite notable advances in computer vision, the precise detection of tiny objects remains a significant challenge, largely because these objects occupy very few pixels in an image. This challenge is especially relevant in geoscience and remote sensing, where high-fidelity detection of tiny objects supports applications ranging from urban planning to environmental monitoring. In this paper, we propose a new framework, DeNoising FPN with Trans R-CNN (DNTR), to improve the performance of tiny object detection. DNTR consists of an easy plug-in design, DeNoising FPN (DN-FPN), and an effective Transformer-based detector, Trans R-CNN. First, feature fusion in the feature pyramid network (FPN) is important for detecting multiscale objects. However, noisy features may be produced during the fusion process since there is no regularization between the features of different scales. Therefore, we introduce a DN-FPN module that uses contrastive learning to suppress noise in each level's features in the top-down path of FPN. Second, based on the two-stage framework, we replace the obsolete R-CNN detector with a novel Trans R-CNN detector to focus on the representation of tiny objects with self-attention. Experimental results show that DNTR outperforms the baselines by at least 17.4% in terms of AP_vt on the AI-TOD dataset and 9.6% in terms of AP on the VisDrone dataset. Our code will be available at this https URL.
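
The two ideas the abstract names for DN-FPN, top-down feature fusion and a contrastive objective, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the loss or the fusion details, so the sketch assumes nearest-neighbour upsampling with element-wise addition for the top-down step and uses InfoNCE, a common contrastive loss, as a stand-in. All function names (`upsample2x`, `top_down_fuse`, `info_nce`) are hypothetical, and features are plain 1-D lists for readability.

```python
import math


def upsample2x(row):
    """Nearest-neighbour 2x upsampling of a 1-D feature row."""
    return [v for v in row for _ in range(2)]


def top_down_fuse(coarse, lateral):
    """One FPN-style top-down step (assumed form): upsample the coarser
    level and add it element-wise to the finer lateral feature."""
    up = upsample2x(coarse)
    return [u + l for u, l in zip(up, lateral)]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss (stand-in, not necessarily the paper's loss):
    -log( exp(s_pos / t) / (exp(s_pos / t) + sum_i exp(s_neg_i / t)) )."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Fuse a 2-element coarse level into a 4-element finer level.
fused = top_down_fuse([1.0, 2.0], [0.5, 0.5, 0.5, 0.5])

# The loss is small when the anchor matches its positive and large otherwise.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this toy setting, minimising such a loss pulls a fused feature toward its chosen positive and away from the negatives, which is one plausible way to regularise features across scales. How positives and negatives are actually formed in DN-FPN is described in the paper, not here.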

Comments:	The article is accepted by IEEE Transactions on Geoscience and Remote Sensing. Our code will be available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.05755 [cs.CV]
	(or arXiv:2406.05755v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.05755

Submission history

From: Hou-I Liu
[v1] Sun, 9 Jun 2024 12:18:15 UTC (8,011 KB)
[v2] Tue, 11 Jun 2024 07:50:33 UTC (8,011 KB)
[v3] Sat, 15 Jun 2024 11:26:14 UTC (8,011 KB)
