SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

Pan, Jing; Lei, Tao; Kim, Kwangyoun; Han, Kyu; Watanabe, Shinji

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2110.05571 (eess)

[Submitted on 11 Oct 2021]

Abstract: The Transformer has been widely adopted as the dominant architecture for most sequence transduction tasks, including automatic speech recognition (ASR), because its attention mechanism excels at capturing long-range dependencies. Although models built solely on attention parallelize better than regular RNNs, a novel network architecture, SRU++, was recently proposed. By combining fast recurrence with an attention mechanism, SRU++ shows strong sequence-modeling capability and achieves near-state-of-the-art results on various language modeling and machine translation tasks with improved compute efficiency. In this work, we present the advantages of applying SRU++ to ASR by comparing it with Conformer across multiple ASR benchmarks, and we study how these benefits generalize to long-form speech inputs. On the popular LibriSpeech benchmark, our SRU++ model achieves a word error rate (WER) of 2.0% / 4.7% on test-clean / test-other, which is competitive with the state-of-the-art Conformer encoder under the same setup. In particular, our analysis shows that SRU++ can surpass Conformer by a large margin on long-form speech input.
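
For readers unfamiliar with the WER metric quoted in the abstract, the following is a minimal sketch of how a word error rate is typically computed: the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and a reference transcript, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions and are not taken from the paper; the authors' actual scoring pipeline is not described on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (not from the paper): tokenizes on whitespace only,
    with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy example: one of six reference words is dropped by the hypothesis.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER = {wer:.1%}")
```

A corpus-level figure such as the ones reported for test-clean and test-other is commonly obtained by summing the edit errors over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.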

Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
Cite as:	arXiv:2110.05571 [eess.AS]
	(or arXiv:2110.05571v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2110.05571

Submission history

From: Jing Pan
[v1] Mon, 11 Oct 2021 19:23:50 UTC (195 KB)
