Learning Reinforced Attentional Representation for End-to-End Visual Tracking

Gao, Peng; Zhang, Qiquan; Xiao, Liyi; Zhang, Yan; Wang, Fei

Computer Science > Computer Vision and Pattern Recognition

arXiv:1908.10009v1 (cs)

[Submitted on 27 Aug 2019 (this version), latest version 2 Jan 2020 (v3)]

Abstract: Although numerous tracking approaches have made tremendous advances over the last decade, achieving high-performance visual tracking remains an open problem. In this paper, we propose an end-to-end network model that learns a reinforced attentional representation for accurate target object discrimination and localization. We utilize a novel hierarchical attentional module with long short-term memory and multi-layer perceptrons to leverage both inter- and intra-frame attention and thereby emphasize visual patterns effectively. Moreover, we incorporate a contextual attentional correlation filter into the backbone network so that our model can be trained in an end-to-end fashion. Our proposed approach not only takes full advantage of informative geometries and semantics, but also updates the correlation filters online, without fine-tuning the backbone network, to adapt to variations in target appearance. Extensive experiments conducted on several popular benchmark datasets demonstrate the effectiveness of our proposed approach while maintaining computational efficiency.
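
The abstract does not spell out the filter formulation, so the following is only a rough illustration of what "updating correlation filters online without fine-tuning the backbone" can look like in general. It is a generic MOSSE-style single-channel filter solved in the Fourier domain, not the authors' contextual attentional correlation filter. The class name, the parameters `lam` (regularizer) and `lr` (update rate), and the use of a raw 2-D patch in place of backbone features are all assumptions made for this sketch.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired filter response: a Gaussian peak at the patch centre."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2.0 * sigma ** 2))


class OnlineCorrelationFilter:
    """Hypothetical MOSSE-style filter; features are assumed fixed (no backbone training)."""

    def __init__(self, patch, lam=1e-3, lr=0.02):
        self.lam = lam  # ridge regularizer (assumed value)
        self.lr = lr    # online interpolation rate (assumed value)
        self.G = np.fft.fft2(gaussian_label(patch.shape))
        F = np.fft.fft2(patch)
        # Closed-form ridge solution, kept as numerator and denominator.
        self.num = self.G * np.conj(F)
        self.den = (F * np.conj(F)).real

    def detect(self, patch):
        """Return the (dy, dx) displacement of the response peak from the centre."""
        Z = np.fft.fft2(patch)
        response = np.fft.ifft2(Z * self.num / (self.den + self.lam)).real
        peak = np.unravel_index(np.argmax(response), response.shape)
        h, w = response.shape
        return int(peak[0]) - h // 2, int(peak[1]) - w // 2

    def update(self, patch):
        """Blend statistics from a new target-centred patch into the filter."""
        F = np.fft.fft2(patch)
        self.num = (1 - self.lr) * self.num + self.lr * self.G * np.conj(F)
        self.den = (1 - self.lr) * self.den + self.lr * (F * np.conj(F)).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.standard_normal((32, 32))             # stand-in for a feature map
    tracker = OnlineCorrelationFilter(target)
    moved = np.roll(target, shift=(3, -5), axis=(0, 1))  # cyclic shift of the target
    print(tracker.detect(moved))
    tracker.update(target)                             # only the filter changes
```

In this sketch, adaptation happens purely through the running averages in `update`; whatever network produces the features is left untouched, which is the general idea the abstract alludes to.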

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:1908.10009 [cs.CV]
	(or arXiv:1908.10009v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1908.10009

Submission history

From: Peng Gao [view email]
[v1] Tue, 27 Aug 2019 03:55:17 UTC (1,472 KB)
[v2] Wed, 28 Aug 2019 00:39:16 UTC (1,541 KB)
[v3] Thu, 2 Jan 2020 01:07:09 UTC (1,546 KB)
