Semi-supervised Active Learning for Video Action Detection

Singh, Ayush; Rana, Aayush J; Kumar, Akash; Vyas, Shruti; Rawat, Yogesh Singh


arXiv:2312.07169 (cs)

[Submitted on 12 Dec 2023 (v1), last revised 3 Apr 2024 (this version, v3)]


Abstract: In this work, we focus on label-efficient learning for video action detection. We develop a novel semi-supervised active learning approach that uses both labeled and unlabeled data, together with informative sample selection, for action detection. Video action detection requires spatio-temporal localization along with classification, which poses several challenges for both informative sample selection in active learning and pseudo-label generation in semi-supervised learning (SSL). First, we propose NoiseAug, a simple augmentation strategy that effectively selects informative samples for video action detection. Next, we propose fft-attention, a novel technique based on high-pass filtering that enables effective use of pseudo labels for SSL in video action detection by emphasizing the relevant activity regions within a video. We evaluate the proposed approach on three benchmark datasets: UCF101-24, JHMDB-21, and YouTube-VOS. First, we demonstrate its effectiveness on video action detection, where the proposed approach outperforms prior work in semi-supervised and weakly-supervised learning, along with several baseline approaches, on both UCF101-24 and JHMDB-21. Next, we show its effectiveness on YouTube-VOS for video object segmentation, demonstrating that it generalizes to other dense prediction tasks in videos. The code and models are publicly available at: this https URL.
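
The abstract says only that fft-attention is "based on high-pass filtering" and gives no further detail. As a generic illustration of that underlying idea, and not the authors' method, the sketch below removes the low spatial frequencies of a single grayscale frame with NumPy's FFT, so that uniform regions are suppressed and fine detail remains. The function name `high_pass_response` and the `cutoff` radius are assumptions made for this example; neither comes from the paper.

```python
import numpy as np


def high_pass_response(frame: np.ndarray, cutoff: int = 4) -> np.ndarray:
    """Return the magnitude of a 2D frame after removing low spatial frequencies.

    `cutoff` is a hypothetical radius in frequency bins, chosen for illustration.
    """
    h, w = frame.shape
    # Shift the zero-frequency (DC) component to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(frame))
    # Boolean mask that is False inside a disc of radius `cutoff` around the centre.
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    mask = dist > cutoff
    # Undo the shift and return to the spatial domain; for real input and this
    # symmetric mask the imaginary part is only numerical noise.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.abs(filtered.real)


# A flat frame carries only the DC component, so nothing survives the filter.
flat = np.full((16, 16), 5.0)
flat_response = high_pass_response(flat)

# A checkerboard sits at the highest spatial frequency and passes through intact.
rows, cols = np.indices((16, 16))
checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)
checker_response = high_pass_response(checker)
```

A response map of this kind is large where the frame changes quickly in space and near zero where it is uniform. How the paper turns such a signal into an attention weight for pseudo labels is not stated in the abstract.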

Comments:	AAAI Conference on Artificial Intelligence, Main Technical Track (AAAI), 2024, Code: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.07169 [cs.CV]
	(or arXiv:2312.07169v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.07169

Submission history

From: Akash Kumar
[v1] Tue, 12 Dec 2023 11:13:17 UTC (12,358 KB)
[v2] Tue, 19 Mar 2024 19:12:26 UTC (12,357 KB)
[v3] Wed, 3 Apr 2024 15:11:33 UTC (12,357 KB)
