Learning to Refactor Action and Co-occurrence Features for Temporal Action Localization

Xia, Kun; Wang, Le; Zhou, Sanping; Zheng, Nanning; Tang, Wei

arXiv:2206.11493 (cs)

[Submitted on 23 Jun 2022]

Abstract: The main challenge of Temporal Action Localization is to retrieve subtle human actions from various co-occurring ingredients, e.g., context and background, in an untrimmed video. While prior approaches have achieved substantial progress by devising advanced action detectors, they still suffer from these co-occurring ingredients, which often dominate the actual action content in videos. In this paper, we explore two orthogonal but complementary aspects of a video snippet, i.e., the action features and the co-occurrence features. Specifically, we develop a novel auxiliary task that decouples these two types of features within a video snippet and recombines them to generate a new feature representation with more salient action information for accurate action localization. We term our method RefactorNet, which first explicitly factorizes the action content and regularizes its co-occurrence features, and then synthesizes a new action-dominated video representation. Extensive experimental results and ablation studies on THUMOS14 and ActivityNet v1.3 demonstrate that our new representation, combined with a simple action detector, can significantly improve action localization performance.
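
The decouple-then-recombine idea in the abstract can be illustrated with a toy sketch. This is not RefactorNet: the abstract gives no architectural details, so everything here is an assumption made for illustration. The sketch assumes a snippet feature is a plain vector, that a hypothetical `action_dir` direction is already known (the paper learns its factorization instead), that decoupling is a simple orthogonal projection, and that "regularizing" the co-occurrence part just means scaling it by a hypothetical `co_weight`.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def decouple(feature: Vector, action_dir: Vector) -> Tuple[Vector, Vector]:
    """Split a feature into an 'action' part and a 'co-occurrence' part.

    Assumption: the action part is the projection of the feature onto
    action_dir; the co-occurrence part is the orthogonal remainder.
    """
    scale = dot(feature, action_dir) / dot(action_dir, action_dir)
    action = [scale * d for d in action_dir]
    co_occurrence = [f - a for f, a in zip(feature, action)]
    return action, co_occurrence


def recombine(action: Vector, co_occurrence: Vector, co_weight: float = 0.5) -> Vector:
    """Build an action-dominated feature by down-weighting the co-occurrence part.

    Assumption: co_weight in [0, 1) is a hand-picked stand-in for regularization.
    """
    return [a + co_weight * c for a, c in zip(action, co_occurrence)]


# Toy usage: the first axis plays the role of "action", the second of "context".
snippet_feature = [3.0, 4.0]
action_part, co_part = decouple(snippet_feature, action_dir=[1.0, 0.0])
refactored = recombine(action_part, co_part, co_weight=0.5)
print(action_part, co_part, refactored)
```

In this toy case the two parts are orthogonal by construction and sum back to the original feature, so the recombined vector keeps the full action component while halving the context component.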

Comments:	Accepted by CVPR 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2206.11493 [cs.CV]
	(or arXiv:2206.11493v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2206.11493

Submission history

From: Kun Xia
[v1] Thu, 23 Jun 2022 06:30:08 UTC (1,847 KB)