HuBERT-EE: Early Exiting HuBERT for Efficient Speech Recognition

Yoon, Ji Won; Woo, Beom Jun; Kim, Nam Soo

arXiv:2204.06328 (cs)

[Submitted on 13 Apr 2022 (v1), last revised 19 Jun 2024 (this version, v2)]

Abstract: Pre-training with self-supervised models, such as Hidden-unit BERT (HuBERT) and wav2vec 2.0, has brought significant improvements in automatic speech recognition (ASR). However, these models usually incur a high computational cost to achieve their strong performance, which slows down inference. To improve model efficiency, we introduce an early exit scheme for ASR, named HuBERT-EE, that allows the model to stop inference dynamically. In HuBERT-EE, multiple early exit branches are added at intermediate layers. When the intermediate prediction of an early exit branch is confident, the model stops inference and returns the corresponding result early. We investigate the proper early exiting criterion and fine-tuning strategy for performing early exiting effectively. Experimental results on LibriSpeech show that HuBERT-EE can accelerate HuBERT inference while balancing the trade-off between performance and latency.
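
The early exit loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which confidence criterion is used, so the sketch assumes a simple one (mean per-frame entropy of the branch output below a threshold). The names `mean_frame_entropy`, `early_exit_forward`, `branches`, and `threshold` are hypothetical, and the layers and branches are treated as plain callables rather than real HuBERT modules.

```python
import math
from typing import Any, Callable, Dict, List, Sequence, Tuple

# One probability distribution over output tokens per audio frame.
FrameProbs = List[List[float]]


def mean_frame_entropy(probs: FrameProbs) -> float:
    """Average Shannon entropy (nats) over frames; lower means more confident."""
    if not probs:
        raise ValueError("need at least one frame")
    total = 0.0
    for frame in probs:
        total -= sum(p * math.log(p) for p in frame if p > 0.0)
    return total / len(probs)


def early_exit_forward(
    features: Any,
    layers: Sequence[Callable[[Any], Any]],
    branches: Dict[int, Callable[[Any], FrameProbs]],
    threshold: float,
) -> Tuple[FrameProbs, int]:
    """Run layers in order and return (probs, exit_layer_index).

    Assumption (not from the paper): a branch counts as "confident" when its
    mean frame entropy is below `threshold`. The last layer must have a
    branch, which is always used if no earlier branch exits.
    """
    last = len(layers) - 1
    if last not in branches:
        raise ValueError("the final layer needs an output branch")
    hidden = features
    for i, layer in enumerate(layers):
        hidden = layer(hidden)
        if i not in branches:
            continue  # no exit branch attached to this layer
        probs = branches[i](hidden)
        if i == last or mean_frame_entropy(probs) < threshold:
            return probs, i  # skip all remaining layers
    raise AssertionError("unreachable: the final branch always returns")
```

Under this assumed criterion, raising `threshold` makes the model exit earlier (lower latency, potentially lower accuracy), while a threshold of zero always runs the full stack, which is one way to picture the performance and latency trade-off the abstract mentions.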

Comments: Accepted by INTERSPEECH 2024
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2204.06328 [cs.CL]
(or arXiv:2204.06328v2 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2204.06328

Submission history

From: Ji Won Yoon
[v1] Wed, 13 Apr 2022 12:11:44 UTC (1,276 KB)
[v2] Wed, 19 Jun 2024 16:39:15 UTC (1,252 KB)
