Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation

Patravali, Jay; Mittal, Gaurav; Yu, Ye; Li, Fuxin; Chen, Mei

arXiv:2109.15317 (cs)

[Submitted on 30 Sep 2021 (v1), last revised 11 Oct 2021 (this version, v2)]

Abstract: We present MetaUVFS, the first Unsupervised Meta-learning algorithm for Video Few-Shot action recognition. MetaUVFS leverages over 550K unlabeled videos to train a two-stream 2D and 3D CNN architecture via contrastive learning, capturing the appearance-specific spatial features and the action-specific spatio-temporal features of video, respectively. MetaUVFS comprises a novel Action-Appearance Aligned Meta-adaptation (A3M) module that learns to focus on the action-oriented video features in relation to the appearance features via explicit few-shot episodic meta-learning over unsupervised hard-mined episodes. Our action-appearance alignment and explicit few-shot learner condition the unsupervised training to mimic the downstream few-shot task, enabling MetaUVFS to significantly outperform all unsupervised methods on few-shot benchmarks. Moreover, unlike previous few-shot action recognition methods, which are supervised, MetaUVFS needs neither base-class labels nor a supervised pretrained backbone. Thus, we need to train MetaUVFS just once for it to perform competitively with, or sometimes even outperform, state-of-the-art supervised methods on the popular HMDB51, UCF101, and Kinetics100 few-shot datasets.
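
For readers unfamiliar with the episodic few-shot setting the abstract refers to, the following is a minimal sketch of how a single N-way K-shot episode can be drawn and scored. It is not the authors' A3M module or training procedure: a plain nearest-centroid classifier is swapped in as a stand-in learner, the 2-D features are synthetic, and the names `sample_episode` and `nearest_centroid_accuracy` are hypothetical helpers introduced only for illustration.

```python
import random
from collections import defaultdict


def sample_episode(features, labels, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode: a support set and a query set."""
    by_class = defaultdict(list)
    for feat, lab in zip(features, labels):
        by_class[lab].append(feat)
    # Only classes with enough examples can supply both support and query items.
    eligible = sorted(
        c for c, items in by_class.items() if len(items) >= k_shot + n_query
    )
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(f, c) for f in picked[:k_shot]]
        query += [(f, c) for f in picked[k_shot:]]
    return support, query


def nearest_centroid_accuracy(support, query):
    """Classify each query by its closest class centroid (squared Euclidean)."""
    grouped = defaultdict(list)
    for feat, lab in support:
        grouped[lab].append(feat)
    centroids = {
        lab: [sum(dim) / len(feats) for dim in zip(*feats)]
        for lab, feats in grouped.items()
    }

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    correct = sum(
        min(centroids, key=lambda lab: sq_dist(feat, centroids[lab])) == true_lab
        for feat, true_lab in query
    )
    return correct / len(query)


# Toy data (assumption): 6 well-separated classes in 2-D, 10 points each.
rng = random.Random(0)
features, labels = [], []
for c in range(6):
    for _ in range(10):
        features.append([10.0 * c + rng.uniform(-1, 1), rng.uniform(-1, 1)])
        labels.append(c)

# One 5-way 1-shot episode with 3 query items per class.
support, query = sample_episode(
    features, labels, n_way=5, k_shot=1, n_query=3, rng=rng
)
accuracy = nearest_centroid_accuracy(support, query)
```

In the unsupervised setting described above, class labels are not available, so the labels used to form episodes would have to come from some label-free source such as the hard-mined episodes the abstract mentions; the integer labels here are placeholders for that step.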

Comments:	ICCV 2021 (Oral)
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2109.15317 [cs.CV]
	(or arXiv:2109.15317v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2109.15317

Submission history

From: Gaurav Mittal
[v1] Thu, 30 Sep 2021 17:59:17 UTC (4,554 KB)
[v2] Mon, 11 Oct 2021 05:44:32 UTC (4,554 KB)
