Video Abnormal Event Detection by Learning to Complete Visual Cloze Tests

Wang, Siqi; Yu, Guang; Cai, Zhiping; Liu, Xinwang; Zhu, En; Yin, Jianping

Abstract:Although deep neural networks (DNNs) enable great progress in video abnormal event detection (VAD), existing solutions typically suffer from two issues: (1) The localization of video events cannot be both precious and comprehensive. (2) The semantics and temporal context are under-explored. To tackle those issues, we are motivated by the prevalent cloze test in education and propose a novel approach named Visual Cloze Completion (VCC), which conducts VAD by learning to complete "visual cloze tests" (VCTs). Specifically, VCC first localizes each video event and encloses it into a spatio-temporal cube (STC). To achieve both precise and comprehensive localization, appearance and motion are used as complementary cues to mark the object region associated with each event. For each marked region, a normalized patch sequence is extracted from current and adjacent frames and stacked into a STC. With each patch and the patch sequence of a STC compared to a visual "word" and "sentence" respectively, we deliberately erase a certain "word" (patch) to yield a VCT. Then, the VCT is completed by training DNNs to infer the erased patch and its optical flow via video semantics. Meanwhile, VCC fully exploits temporal context by alternatively erasing each patch in temporal context and creating multiple VCTs. Furthermore, we propose localization-level, event-level, model-level and decision-level solutions to enhance VCC, which can further exploit VCC's potential and produce significant performance improvement gain. Extensive experiments demonstrate that VCC achieves state-of-the-art VAD performance. Our codes and results are open at this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2108.02356 [cs.CV]
	(or arXiv:2108.02356v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2108.02356

Computer Science > Computer Vision and Pattern Recognition

Title:Video Abnormal Event Detection by Learning to Complete Visual Cloze Tests

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators