xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement

Kühne, Nikolai Lund; Østergaard, Jan; Jensen, Jesper; Tan, Zheng-Hua

arXiv:2501.06146 (cs)

[Submitted on 10 Jan 2025]

Abstract: While attention-based architectures, such as Conformers, excel in speech enhancement, they face challenges such as scalability with respect to input sequence length. In contrast, the recently proposed Extended Long Short-Term Memory (xLSTM) architecture offers linear scalability. However, xLSTM-based models remain unexplored for speech enhancement. This paper introduces xLSTM-SENet, the first xLSTM-based single-channel speech enhancement system. A comparative analysis reveals that xLSTM, and notably even LSTM, can match or outperform state-of-the-art Mamba- and Conformer-based systems across various model sizes in speech enhancement on the VoiceBank+DEMAND dataset. Through ablation studies, we identify key architectural design choices, such as exponential gating and bidirectionality, that contribute to its effectiveness. Our best xLSTM-based model, xLSTM-SENet2, outperforms state-of-the-art Mamba- and Conformer-based systems on the VoiceBank+DEMAND dataset.
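
The abstract names exponential gating and bidirectionality as key design choices and contrasts linear scaling with attention. The sketch below is a toy illustration of those two ideas only, not the authors' model: it assumes scalar inputs, omits the forget and output gates, and the names `exp_gated_scan` and `bidirectional_scan` are hypothetical. Each pass is a single loop over the sequence, so the work grows linearly with sequence length.

```python
import math


def exp_gated_scan(xs, gate_logits):
    """One left-to-right pass of a toy exponentially gated recurrence.

    Illustrative assumption: the input gate is exp(logit) and there is no
    forget gate, so each output is a causal, softmax-weighted average of
    the inputs seen so far. Gate logits are assumed to be finite.
    """
    c = 0.0          # cell state: gated running sum of inputs
    n = 0.0          # normalizer state: gated running sum of gate weights
    m = -math.inf    # running max of logits, keeps exp() from overflowing
    out = []
    for x, g in zip(xs, gate_logits):
        m_new = max(m, g)
        decay = math.exp(m - m_new)   # rescales old state to the new max
        i = math.exp(g - m_new)       # stabilized exponential input gate
        c = decay * c + i * x
        n = decay * n + i
        m = m_new
        out.append(c / n)
    return out


def bidirectional_scan(xs, gate_logits):
    """Run the scan forwards and backwards and pair the results per step."""
    fwd = exp_gated_scan(xs, gate_logits)
    bwd = exp_gated_scan(xs[::-1], gate_logits[::-1])[::-1]
    return list(zip(fwd, bwd))


if __name__ == "__main__":
    print(bidirectional_scan([1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

With equal gate logits, the forward output at each step is the mean of the inputs up to that step and the backward output is the mean of the inputs from that step onward. Subtracting the running maximum before exponentiating is what lets large logits pass through without overflow.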

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2501.06146 [cs.SD] (or arXiv:2501.06146v1 [cs.SD] for this version)
DOI: https://doi.org/10.48550/arXiv.2501.06146

Submission history

From: Nikolai Lund Kühne
[v1] Fri, 10 Jan 2025 18:10:06 UTC (1,146 KB)
