DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models

Tang, Peng; Zhu, Pengkai; Li, Tian; Appalaraju, Srikar; Mahadevan, Vijay; Manmatha, R.

arXiv:2311.08623 (cs)

[Submitted on 15 Nov 2023]

Abstract: Encoder-decoder transformer models have achieved great success on various vision-language (VL) tasks, but they suffer from high inference latency. Typically, the decoder accounts for most of the latency because of auto-regressive decoding. To accelerate inference, we propose Dynamic Early Exit on Decoder (DEED). We build a multi-exit encoder-decoder transformer model that is trained with deep supervision so that each of its decoder layers is capable of generating plausible predictions. In addition, we leverage simple yet practical techniques, including a shared generation head and adaptation modules, to maintain accuracy when exiting at shallow decoder layers. Based on the multi-exit model, we perform step-level dynamic early exit during inference, where at each individual decoding step the model may decide to use fewer decoder layers based on its confidence at the current layer. Because different numbers of decoder layers may be used at different decoding steps, we compute the deeper-layer decoder features of previous decoding steps just-in-time, which ensures that the features from different decoding steps are semantically aligned. We evaluate our approach with two state-of-the-art encoder-decoder transformer models on various VL tasks. We show that our approach can reduce overall inference latency by 30%-60% with comparable or even higher accuracy compared to baselines.
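
The step-level early exit described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not say how confidence is measured, so the sketch assumes the maximum softmax probability compared against a fixed threshold. The names `early_exit_step`, `layers`, `head`, and `threshold` are hypothetical, the "layers" are plain Python functions rather than transformer blocks, and the just-in-time computation of deeper-layer features for earlier steps is omitted.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_step(
    hidden: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    head: Callable[[Vector], Vector],
    threshold: float,
) -> Tuple[int, int]:
    """Run one decoding step, exiting at the first confident layer.

    Assumption: confidence is the maximum softmax probability produced by
    a single generation head shared across all layers.
    Returns (predicted_token_id, number_of_layers_used).
    """
    if not layers:
        raise ValueError("at least one decoder layer is required")
    used = 0
    probs: Vector = []
    for layer in layers:
        hidden = layer(hidden)
        used += 1
        probs = softmax(head(hidden))  # same head applied at every exit
        if max(probs) >= threshold:
            break  # confident enough: skip the remaining layers
    token = max(range(len(probs)), key=probs.__getitem__)
    return token, used


def double(h: Vector) -> Vector:
    """Toy stand-in for a decoder layer: sharpens the representation."""
    return [2.0 * x for x in h]


def identity_head(h: Vector) -> Vector:
    """Toy stand-in for the shared head: hidden state is used as logits."""
    return list(h)


if __name__ == "__main__":
    token, used = early_exit_step([1.0, 0.0, 0.0], [double] * 4, identity_head, 0.9)
    print(token, used)
```

In this toy setup the confidence grows with depth, so a lower threshold exits earlier and an unreachable threshold falls back to using every layer. In a real model the threshold would trade latency against accuracy and would need to be tuned on held-out data.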

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2311.08623 [cs.CV]
	(or arXiv:2311.08623v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.08623

Submission history

From: Peng Tang
[v1] Wed, 15 Nov 2023 01:01:02 UTC (649 KB)
