Learning Longer-term Dependencies in RNNs with Auxiliary Losses

Trinh, Trieu H.; Dai, Andrew M.; Luong, Minh-Thang; Le, Quoc V.

arXiv:1803.00144v3 (cs)

[Submitted on 1 Mar 2018 (v1), last revised 13 Jun 2018 (this version, v3)]

Abstract: Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability of RNNs to capture long-term dependencies by adding an unsupervised auxiliary loss to the original objective. This auxiliary loss forces RNNs to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full BPTT. We evaluate our method in a variety of settings, including pixel-by-pixel image classification with sequence lengths up to 16,000, and a real document classification benchmark. Our results highlight the good performance and resource efficiency of this approach compared with competitive baselines, including other recurrent models and a comparably sized Transformer. Further analyses reveal beneficial effects of the auxiliary loss on optimization and regularization, as well as extreme cases where there is little to no backpropagation.
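As a rough illustration of the idea in the abstract, the sketch below shows one way an auxiliary target could be chosen and combined with the original objective. It is not the paper's implementation: the abstract does not say how segments are sampled or how the two losses are weighted, so the random anchor, the fixed segment length, the averaging, and the names `sample_aux_segment`, `combined_loss`, and `aux_weight` are all assumptions made here for illustration. The model and decoder themselves are left out; the loss values passed in are placeholders.

```python
import random


def sample_aux_segment(seq, seg_len, mode, rng):
    """Pick an anchor position and the segment an auxiliary decoder would target.

    mode="reconstruct": target is the seg_len items ending at the anchor
                        (events the RNN has already read).
    mode="predict":     target is the seg_len items starting at the anchor
                        (events the RNN has not read yet).
    Hypothetical helper; the sampling scheme is an assumption, not the paper's.
    """
    if seg_len <= 0 or seg_len >= len(seq):
        raise ValueError("seg_len must be between 1 and len(seq) - 1")
    if mode == "reconstruct":
        # anchor counts how many items have been read so far
        anchor = rng.randrange(seg_len, len(seq) + 1)
        return anchor, seq[anchor - seg_len:anchor]
    if mode == "predict":
        anchor = rng.randrange(0, len(seq) - seg_len + 1)
        return anchor, seq[anchor:anchor + seg_len]
    raise ValueError("mode must be 'reconstruct' or 'predict'")


def combined_loss(main_loss, aux_losses, aux_weight=1.0):
    """Original objective plus a weighted mean of the auxiliary losses.

    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    if not aux_losses:
        return main_loss
    return main_loss + aux_weight * sum(aux_losses) / len(aux_losses)


# Usage with placeholder numbers standing in for real model losses.
rng = random.Random(0)
sequence = list(range(20))
anchor, target = sample_aux_segment(sequence, 5, "reconstruct", rng)
total = combined_loss(0.7, [0.2, 0.4], aux_weight=0.5)
```

In a real training loop the auxiliary loss would be computed by a decoder started from the RNN state at the anchor, and gradients from it could be truncated to a short window around the anchor, which is what would make long sequences tractable.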

Comments:	ICML 2018
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1803.00144 [cs.LG]
	(or arXiv:1803.00144v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1803.00144

Submission history

From: Trieu Trinh
[v1] Thu, 1 Mar 2018 00:28:07 UTC (108 KB)
[v2] Fri, 1 Jun 2018 17:49:15 UTC (108 KB)
[v3] Wed, 13 Jun 2018 08:35:57 UTC (356 KB)
