A Benchmark for Crime Surveillance Video Analysis with Large Models

Chen, Haoran; Yi, Dong; Cao, Moyan; Huang, Chensen; Zhu, Guibo; Wang, Jinqiao

Computer Science > Computer Vision and Pattern Recognition

arXiv:2502.09325 (cs)

[Submitted on 13 Feb 2025]

Abstract: Anomaly analysis in surveillance videos is a crucial topic in computer vision. In recent years, multimodal large language models (MLLMs) have outperformed task-specific models in various domains. Although MLLMs are particularly versatile, their ability to understand anomalous concepts and details remains insufficiently studied, because the field's outdated benchmarks provide neither MLLM-style QA pairs nor efficient algorithms for assessing a model's open-ended text responses. To fill this gap, we propose UCVL, a benchmark for crime surveillance video analysis with large models, comprising 1,829 videos and reorganized annotations from the UCF-Crime and UCF-Crime Annotation datasets. We design six types of questions and generate diverse QA pairs. We then develop detailed instructions and use OpenAI's GPT-4o for accurate assessment. We benchmark eight prevailing MLLMs ranging from 0.5B to 40B parameters, and the results demonstrate the reliability of this benchmark. Moreover, we fine-tune LLaVA-OneVision on UCVL's training set. The resulting improvement validates our data's high quality for video anomaly analysis.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2502.09325 [cs.CV]
	(or arXiv:2502.09325v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2502.09325

Submission history

From: Chensen Huang
[v1] Thu, 13 Feb 2025 13:38:17 UTC (2,087 KB)
