ViLCo-Bench: VIdeo Language COntinual learning Benchmark

Tang, Tianqi; Deldari, Shohreh; Xue, Hao; De Melo, Celso; Salim, Flora D.

arXiv:2406.13123 (cs)

[Submitted on 19 Jun 2024 (v1), last revised 18 Oct 2024 (this version, v2)]

Abstract: Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark, ViLCo-Bench, designed to evaluate continual learning models across a range of video-text tasks. The dataset comprises ten-minute-long videos and corresponding language queries collected from publicly available datasets. Additionally, we introduce a novel memory-efficient framework that incorporates self-supervised learning and mimics long-term and short-term memory effects. This framework addresses challenges including memory complexity from long video clips, natural language complexity from open queries, and text-video misalignment. We posit that ViLCo-Bench, with greater complexity compared to existing continual learning benchmarks, would serve as a critical tool for exploring the video-language domain, extending beyond conventional class-incremental tasks, and addressing complex and limited annotation issues. The curated data, evaluations, and our novel method are available at this https URL.

Comments:	14 pages, 4 figures, 8 tables, Accepted at NeurIPS Dataset and Benchmark Track 2024
Subjects:	Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.13123 [cs.AI]
	(or arXiv:2406.13123v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2406.13123

Submission history

From: Shohreh Deldari
[v1] Wed, 19 Jun 2024 00:38:19 UTC (1,847 KB)
[v2] Fri, 18 Oct 2024 05:20:34 UTC (1,740 KB)
