Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition

Tang, Haoyu; Liu, Zhaoyi; Zeng, Chang; Li, Xinfeng


arXiv:2303.13072 (cs)

[Submitted on 23 Mar 2023 (v1), last revised 5 Apr 2023 (this version, v2)]


Abstract: Transformer-based models have recently achieved strong results in end-to-end (E2E) automatic speech recognition (ASR), and they make it possible to deploy E2E ASR systems on smart devices. However, these models still require a large number of parameters. To overcome this drawback of universal Transformer models for ASR on edge devices, we propose a solution that reuses blocks in Transformer models for small-footprint ASR systems, accommodating resource limitations without compromising recognition accuracy. Specifically, we design a novel block-reusing strategy for speech Transformer (BRST) to improve parameter efficiency, and we propose an adapter module (ADM) that yields a compact and adaptable model with only a few additional trainable parameters accompanying each reused block. We evaluate the proposed method on the public AISHELL-1 corpus. The results show that it achieves a character error rate (CER) of 9.3% with 7.6M parameters without the ADM, and 6.63% with 8.3M parameters with the ADM. We also provide a deeper analysis of the effect of the ADM in the general block-reusing method.
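To make the parameter argument in the abstract concrete, the following is a rough sketch of how sharing one block across layers, plus a small per-layer adapter, changes the weight count. It is not the paper's BRST or ADM: the adapter is assumed to be a generic bottleneck (down-projection then up-projection), the function names are hypothetical, and the dimensions (`d_model=256`, `d_ff=2048`, 12 layers, bottleneck 32) are illustrative values rather than the configuration used in the experiments.

```python
def transformer_block_params(d_model: int, d_ff: int) -> int:
    """Approximate weight count of one encoder block.

    Counts only the large matrices; biases and layer norms are ignored.
    """
    attention = 4 * d_model * d_model   # Q, K, V and output projections
    feed_forward = 2 * d_model * d_ff   # two linear layers
    return attention + feed_forward


def adapter_params(d_model: int, bottleneck: int) -> int:
    """Assumed bottleneck adapter: d_model -> bottleneck -> d_model."""
    return 2 * d_model * bottleneck


def stack_params(n_layers: int, n_unique_blocks: int,
                 d_model: int, d_ff: int, bottleneck: int = 0) -> int:
    """Weights in a stack where n_unique_blocks are reused over n_layers.

    Each layer position gets its own adapter (none when bottleneck == 0).
    """
    shared = n_unique_blocks * transformer_block_params(d_model, d_ff)
    adapters = n_layers * adapter_params(d_model, bottleneck)
    return shared + adapters


# Hypothetical configuration, chosen only for illustration.
baseline = stack_params(12, 12, 256, 2048)                  # no sharing
reused = stack_params(12, 1, 256, 2048)                     # one shared block
reused_adapted = stack_params(12, 1, 256, 2048, bottleneck=32)

print(baseline, reused, reused_adapted)
```

Under these assumptions the per-layer adapters add only a small fraction of the weights saved by sharing, which is the trade-off the abstract describes in general terms.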

Subjects:	Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2303.13072 [cs.SD]
	(or arXiv:2303.13072v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2303.13072

Submission history

From: Chang Zeng
[v1] Thu, 23 Mar 2023 06:54:37 UTC (846 KB)
[v2] Wed, 5 Apr 2023 08:36:34 UTC (846 KB)
