An Efficient Transformer Decoder with Compressed Sub-layers

Li, Yanyang; Lin, Ye; Xiao, Tong; Zhu, Jingbo

arXiv:2101.00542 (cs)

[Submitted on 3 Jan 2021 (v1), last revised 11 May 2023 (this version, v4)]

Abstract: The large attention-based encoder-decoder network (Transformer) has recently become prevalent due to its effectiveness, but the high computational complexity of its decoder makes it inefficient. By examining the mathematical formulation of the decoder, we show that under some mild conditions the architecture can be simplified by compressing its sub-layers, the basic building blocks of the Transformer, thereby achieving higher parallelism. We therefore propose the Compressed Attention Network, whose decoder layer consists of only one sub-layer instead of three. Extensive experiments on 14 WMT machine translation tasks show that our model is 1.42x faster, with performance on par with a strong baseline. This strong baseline is itself already 2x faster than the widely used standard baseline, without loss in performance.
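
The two speedup figures in the abstract are each stated relative to a different baseline. The sketch below shows what they would imply together, assuming (this is not stated in the abstract) that both factors were measured on the same metric and setup, so that they compose multiplicatively. The function name `combined_speedup` is hypothetical and used only for illustration.

```python
def combined_speedup(*factors: float) -> float:
    """Multiply successive speedup factors, each relative to the previous system."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


# Figures quoted in the abstract.
strong_vs_standard = 2.0   # strong baseline vs. widely used standard baseline
can_vs_strong = 1.42       # proposed model vs. strong baseline

# Assumption: the two factors compose multiplicatively.
can_vs_standard = combined_speedup(strong_vs_standard, can_vs_strong)
print(f"Implied speedup over the standard baseline: {can_vs_standard:.2f}x")
```

Under that assumption the proposed model would be roughly 2.84x faster than the standard baseline; the abstract itself reports only the two separate factors.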

Comments:	Accepted by AAAI 2021
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2101.00542 [cs.CL]
	(or arXiv:2101.00542v4 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2101.00542

Submission history

From: Ye Lin
[v1] Sun, 3 Jan 2021 02:05:01 UTC (40 KB)
[v2] Mon, 19 Jul 2021 06:30:58 UTC (41 KB)
[v3] Wed, 10 May 2023 07:31:16 UTC (237 KB)
[v4] Thu, 11 May 2023 08:30:53 UTC (237 KB)
