Discovery of Shared Semantic Spaces for Multi-Scene Video Query and Summarization

Xu, Xun; Hospedales, Timothy; Gong, Shaogang

doi:10.1109/TCSVT.2016.2532719

Abstract: The growing rate of public space CCTV installations has generated a need for automated methods for exploiting video surveillance data, including scene understanding, query, behaviour annotation and summarization. For this reason, extensive research has been performed on surveillance scene understanding and analysis. However, most studies have considered single scenes, or groups of adjacent scenes. The semantic similarity between different but related scenes (e.g., many different traffic scenes of similar layout) is not generally exploited to improve automated surveillance tasks or reduce manual effort. Exploiting commonality, and sharing supervised annotations, between different scenes is challenging for several reasons. Some scenes are totally unrelated, so any information sharing between them would be detrimental, while others share only a subset of common activities, so information sharing is useful only if it is selective. Moreover, semantically similar activities that should be modelled together and shared across scenes may have quite different pixel-level appearance in each scene. To address these issues, we develop a new framework for distributed multiple-scene global understanding that clusters surveillance scenes by their ability to explain each other's behaviours, and further discovers which subset of activities are shared versus scene-specific within each cluster. We show how to use this structured representation of multiple scenes to improve common surveillance tasks including scene activity understanding, cross-scene query-by-example, behaviour classification with reduced supervised labelling requirements, and video summarization. In each case we demonstrate how our multi-scene model improves on a collection of standard single-scene models and a flat model of all scenes.

Comments:	Multi-Scene Traffic Behaviour Analysis. Accepted at IEEE Transactions on Circuits and Systems for Video Technology
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1507.07458 [cs.CV]
	(or arXiv:1507.07458v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1507.07458
Related DOI:	https://doi.org/10.1109/TCSVT.2016.2532719
