Exploring Relations in Untrimmed Videos for Self-Supervised Learning

Luo, Dezhao; Fang, Bo; Zhou, Yu; Zhou, Yucan; Wu, Dayan; Wang, Weiping

arXiv:2008.02711 (cs)

[Submitted on 6 Aug 2020]

Abstract: Existing video self-supervised learning methods mainly rely on trimmed videos for model training. However, trimmed datasets are produced by manually annotating untrimmed videos, so these methods are not truly self-supervised. In this paper, we propose a novel self-supervised method, Exploring Relations in Untrimmed Videos (ERUV), which can be applied directly to untrimmed (truly unlabeled) videos to learn spatio-temporal features. ERUV first generates single-shot videos by shot change detection. A designed sampling strategy then models relations between video clips, and the sampling choice is recorded as the self-supervision signal. Finally, the network learns representations by predicting the category of the relation between video clips. ERUV can thus compare the differences and similarities of videos, which is also an essential procedure for action- and video-related tasks. We validate the learned models on action recognition and video retrieval with three kinds of 3D CNNs. Experimental results show that ERUV learns richer representations and outperforms state-of-the-art self-supervised methods by significant margins.
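
To make the pipeline in the abstract concrete, here is a minimal Python sketch of how a sampling choice can double as a training label. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not enumerate the relation categories, so the three labels below are hypothetical, as are the function names `split_into_shots` and `sample_pair`. Shot change detection itself is not shown; the sketch assumes the boundary frame indices are already available from some detector.

```python
import random
from typing import List, Tuple

# Hypothetical relation labels (assumption: the abstract does not list them).
SAME_SHOT_OVERLAP = 0   # two clips from one shot that share frames
SAME_SHOT_DISJOINT = 1  # two clips from one shot with no shared frames
DIFFERENT_SHOT = 2      # two clips taken from two different shots

Shot = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def split_into_shots(boundaries: List[int], num_frames: int) -> List[Shot]:
    """Turn sorted shot-change frame indices into (start, end) shots."""
    cuts = [0] + [b for b in boundaries if 0 < b < num_frames] + [num_frames]
    return [(s, e) for s, e in zip(cuts, cuts[1:]) if e > s]


def sample_pair(shots: List[Shot], clip_len: int,
                rng: random.Random) -> Tuple[int, int, int]:
    """Return (clip_a_start, clip_b_start, relation_label).

    The label is simply the sampling rule that was used, so no manual
    annotation is needed.
    """
    if clip_len < 2:
        raise ValueError("clip_len must be at least 2")
    # Shots that can hold one clip, and shots that can hold two disjoint clips.
    usable = [s for s in shots if s[1] - s[0] >= clip_len]
    long_shots = [s for s in shots if s[1] - s[0] >= 2 * clip_len]
    if len(usable) < 2 or not long_shots:
        raise ValueError("not enough footage to sample every relation")

    label = rng.choice([SAME_SHOT_OVERLAP, SAME_SHOT_DISJOINT, DIFFERENT_SHOT])

    if label == DIFFERENT_SHOT:
        shot_a, shot_b = rng.sample(usable, 2)  # two distinct shots
        a = rng.randrange(shot_a[0], shot_a[1] - clip_len + 1)
        b = rng.randrange(shot_b[0], shot_b[1] - clip_len + 1)
        return a, b, label

    start, end = rng.choice(long_shots)
    if label == SAME_SHOT_DISJOINT:
        # Leave room after clip A for a full, non-overlapping clip B.
        a = rng.randrange(start, end - 2 * clip_len + 1)
        b = rng.randrange(a + clip_len, end - clip_len + 1)
    else:
        # Shift clip B by fewer than clip_len frames so the clips overlap.
        a = rng.randrange(start, end - clip_len)
        b = min(a + rng.randrange(1, clip_len), end - clip_len)
    return a, b, label
```

For example, `sample_pair(split_into_shots([100, 250], 400), 16, random.Random(0))` draws two 16-frame clip start indices from a 400-frame video with assumed cuts at frames 100 and 250, together with the label a network would be trained to predict from the clip pair.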

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2008.02711 [cs.CV]
	(or arXiv:2008.02711v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2008.02711

Submission history

From: Dezhao Luo
[v1] Thu, 6 Aug 2020 15:29:25 UTC (5,006 KB)
