Movies2Scenes: Learning Scene Representations Using Movie Similarities

Chen, Shixing; Hao, Xiang; Nie, Xiaohan; Hamid, Raffay

arXiv:2202.10650v1 (cs)

[Submitted on 22 Feb 2022 (this version), latest version 30 Mar 2023 (v3)]

Abstract: Automatic understanding of movie-scenes is an important problem with multiple downstream applications including video-moderation, search and recommendation. The long-form nature of movies makes labeling of movie scenes a laborious task, which makes applying end-to-end supervised approaches for understanding movie-scenes a challenging problem. Directly applying state-of-the-art visual representations learned from large-scale image datasets for movie-scene understanding does not prove to be effective given the large gap between the two domains. To address these challenges, we propose a novel contrastive learning approach that uses commonly available sources of movie-information (e.g., genre, synopsis, more-like-this information) to learn a general-purpose scene-representation. Using a new dataset (MovieCL30K) with 30,340 movies, we demonstrate that our learned scene-representation surpasses existing state-of-the-art results on eleven downstream tasks from multiple datasets. To further show the effectiveness of our scene-representation, we introduce another new dataset (MCD) focused on large-scale video-moderation with 44,581 clips containing sex, violence, and drug-use activities covering 18,330 movies and TV episodes, and show strong gains over existing state-of-the-art approaches.
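The abstract names a contrastive learning approach driven by movie-level similarity but does not spell out the loss. As a rough illustration only, the sketch below shows a generic InfoNCE-style contrastive loss for a single scene embedding, assuming that scenes drawn from movies judged similar act as positives and scenes from other movies act as negatives. The function names, the temperature value, and the way positives and negatives are chosen are all assumptions made for this example and are not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so that dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for one anchor scene embedding.

    Assumption (hypothetical setup, not the paper's stated method):
    `positives` are embeddings of scenes from movies treated as similar
    to the anchor's movie, and `negatives` come from dissimilar movies.
    """
    a = normalize(anchor)
    pos = [math.exp(dot(a, normalize(p)) / temperature) for p in positives]
    neg = [math.exp(dot(a, normalize(n)) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    # Average, over positives, of the negative log-probability of picking
    # that positive among all candidates.
    return -sum(math.log(p / denom) for p in pos) / len(pos)
```

Under these assumptions the loss is small when the anchor lies close to its positives and far from its negatives, and it grows when that relationship is reversed.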

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2202.10650 [cs.CV]
	(or arXiv:2202.10650v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2202.10650

Submission history

From: Shixing Chen [view email]
[v1] Tue, 22 Feb 2022 03:31:33 UTC (8,908 KB)
[v2] Sat, 12 Mar 2022 03:08:46 UTC (8,763 KB)
[v3] Thu, 30 Mar 2023 00:51:47 UTC (12,069 KB)
