Continual Learning Optimizations for Auto-regressive Decoder of Multilingual ASR systems

Kwok, Chin Yuen; Yip, Jia Qi; Chng, Eng Siong

doi:10.21437/Interspeech.2024-205

arXiv:2407.03645 (cs)

[Submitted on 4 Jul 2024 (v1), last revised 27 Sep 2024 (this version, v3)]

Abstract: Continual Learning (CL) involves fine-tuning pre-trained models with new data while maintaining the performance on the pre-trained data. This is particularly relevant for expanding multilingual ASR (MASR) capabilities. However, existing CL methods, mainly designed for computer vision and reinforcement learning tasks, often yield sub-optimal results when directly applied to MASR. We hypothesise that this is because CL of the auto-regressive decoder in the MASR model is difficult. To verify this, we propose four optimizations on the decoder. They include decoder-layer gradient surgery, freezing unused token embeddings, suppressing output of newly added tokens, and learning rate re-scaling. Our experiments on adapting Whisper to 10 unseen languages from the Common Voice dataset demonstrate that these optimizations reduce the Average Word Error Rate (AWER) of pretrained languages from 14.2% to 12.4% compared with Experience Replay, without compromising the AWER of new languages.
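
The abstract names the four decoder optimizations but does not say how they are implemented. As a rough illustration of two of them, the sketch below shows a generic PCGrad-style gradient projection (one common form of "gradient surgery") and a logit mask that suppresses newly added tokens. This is an assumption-laden toy in plain Python, not the authors' implementation: the function names `project_conflicting` and `suppress_new_tokens`, the flat-list gradient representation, and the choice of a PCGrad-style projection rule are all hypothetical.

```python
import math

def project_conflicting(g_new, g_old):
    """PCGrad-style projection (assumed variant of gradient surgery).

    If the new-language gradient g_new conflicts with the pre-trained-language
    gradient g_old (negative dot product), remove the component of g_new that
    points against g_old. Otherwise return g_new unchanged.
    """
    dot = sum(a * b for a, b in zip(g_new, g_old))
    if dot >= 0:
        return list(g_new)
    # dot < 0 implies g_old is non-zero, so the division is safe.
    norm_sq = sum(b * b for b in g_old)
    scale = dot / norm_sq
    return [a - scale * b for a, b in zip(g_new, g_old)]

def suppress_new_tokens(logits, new_token_ids):
    """Mask the logits of newly added tokens so they cannot be selected."""
    masked = list(logits)
    for i in new_token_ids:
        masked[i] = -math.inf
    return masked

# Conflicting gradients: the component of g_new along g_old is removed.
surgery = project_conflicting([1.0, -1.0], [0.0, 1.0])

# Token 2 is treated as a newly added token and is masked out before argmax.
masked = suppress_new_tokens([0.1, 2.0, 3.5], [2])
best = max(range(len(masked)), key=lambda i: masked[i])
```

In a real system the same ideas would be applied to decoder-layer parameter gradients and to the decoder's output logits respectively; where and when the paper applies them is specified in the full text, not in the abstract.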

Comments:	Proceedings of Interspeech
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2407.03645 [cs.CL]
	(or arXiv:2407.03645v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.03645
Related DOI:	https://doi.org/10.21437/Interspeech.2024-205

Submission history

From: Chin Yuen Kwok
[v1] Thu, 4 Jul 2024 05:35:47 UTC (1,759 KB)
[v2] Fri, 12 Jul 2024 03:07:04 UTC (1,748 KB)
[v3] Fri, 27 Sep 2024 05:27:45 UTC (1,748 KB)
