Fine-grained Iterative Attention Network for TemporalLanguage Localization in Videos

Qu, Xiaoye; Tang, Pengwei; Zhou, Zhikang; Cheng, Yu; Dong, Jianfeng; Zhou, Pan

Abstract:Temporal language localization in videos aims to ground one video segment in an untrimmed video based on a given sentence query. To tackle this task, designing an effective model to extract ground-ing information from both visual and textual modalities is crucial. However, most previous attempts in this field only focus on unidirectional interactions from video to query, which emphasizes which words to listen and attends to sentence information via vanilla soft attention, but clues from query-by-video interactions implying where to look are not taken into consideration. In this paper, we propose a Fine-grained Iterative Attention Network (FIAN) that consists of an iterative attention module for bilateral query-video in-formation extraction. Specifically, in the iterative attention module, each word in the query is first enhanced by attending to each frame in the video through fine-grained attention, then video iteratively attends to the integrated query. Finally, both video and query information is utilized to provide robust cross-modal representation for further moment localization. In addition, to better predict the target segment, we propose a content-oriented localization strategy instead of applying recent anchor-based localization. We evaluate the proposed method on three challenging public benchmarks: Ac-tivityNet Captions, TACoS, and Charades-STA. FIAN significantly outperforms the state-of-the-art approaches.

Comments:	ACM MM 2020
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2008.02448 [cs.CV]
	(or arXiv:2008.02448v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2008.02448

Computer Science > Computer Vision and Pattern Recognition

Title:Fine-grained Iterative Attention Network for TemporalLanguage Localization in Videos

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators