Bidirectional Long-Short Term Memory for Video Description

Bin, Yi; Yang, Yang; Huang, Zi; Shen, Fumin; Xu, Xing; Shen, Heng Tao

arXiv:1606.04631 (cs)

[Submitted on 15 Jun 2016]

Abstract: Video captioning has been attracting broad research attention in the multimedia community. However, most existing approaches either ignore temporal information among video frames or employ only local contextual temporal knowledge. In this work, we propose a novel video captioning framework, termed Bidirectional Long-Short Term Memory (BiLSTM), which deeply captures the bidirectional global temporal structure in video. Specifically, we first devise a joint visual modelling approach that encodes video data by combining a forward LSTM pass and a backward LSTM pass with visual features from Convolutional Neural Networks (CNNs). Then, we inject the derived video representation into the subsequent language model for initialization. The benefits are twofold: 1) comprehensively preserving sequential and visual information; and 2) adaptively learning dense visual features and sparse semantic representations for videos and sentences, respectively. We verify the effectiveness of the proposed video captioning framework on a commonly used benchmark, the Microsoft Video Description (MSVD) corpus, and the experimental results demonstrate the superiority of the proposed approach over several state-of-the-art methods.
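The joint visual modelling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `LSTMCell`, `run_lstm`, and `encode_video` names are hypothetical, the weights are random and untrained, the frame features are synthetic stand-ins for CNN outputs, and the mean-pooling used to reduce the per-frame joint features to a single vector is an assumption, since the abstract does not say how the three sources are merged.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Plain LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n_in = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def _affine(self, gate, z):
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.W[gate], self.b[gate])]

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(a) for a in self._affine("i", z)]
        f = [sigmoid(a) for a in self._affine("f", z)]
        o = [sigmoid(a) for a in self._affine("o", z)]
        g = [math.tanh(a) for a in self._affine("c", z)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, frames):
    """Run the cell over a sequence and return the hidden state at each step."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in frames:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def encode_video(frames, fwd_cell, bwd_cell):
    fwd = run_lstm(fwd_cell, frames)
    # Backward pass reads the frames in reverse; flip back to frame order.
    bwd = run_lstm(bwd_cell, frames[::-1])[::-1]
    # Per-frame joint feature: forward state, backward state, raw frame feature.
    joint = [hf + hb + x for hf, hb, x in zip(fwd, bwd, frames)]
    # Assumption: mean-pool over time to get one vector for the language model.
    n = len(joint)
    return [sum(col) / n for col in zip(*joint)]


rng = random.Random(0)
feat_dim, hidden = 8, 4  # hypothetical sizes, far smaller than real CNN features
frames = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(5)]
video_vec = encode_video(frames,
                         LSTMCell(feat_dim, hidden, rng),
                         LSTMCell(feat_dim, hidden, rng))
print(len(video_vec))  # 2 * hidden + feat_dim
```

In a full captioning model, a vector such as `video_vec` would be used to initialise the language model that generates the sentence; that decoder is omitted here.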

Comments:	5 pages
Subjects:	Multimedia (cs.MM); Computation and Language (cs.CL)
Cite as:	arXiv:1606.04631 [cs.MM]
	(or arXiv:1606.04631v1 [cs.MM] for this version)
	https://doi.org/10.48550/arXiv.1606.04631

Submission history

From: Fumin Shen Dr.
[v1] Wed, 15 Jun 2016 03:26:53 UTC (225 KB)
