Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention

Koizumi, Yuma; Yatabe, Kohei; Delcroix, Marc; Masuyama, Yoshiki; Takeuchi, Daiki

arXiv:2002.05873 (eess)

[Submitted on 14 Feb 2020]

Abstract: This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test utterance. Conventional studies of deep neural network (DNN)-based speech enhancement mainly focus on building a speaker-independent model. Meanwhile, in speech applications including speech recognition and synthesis, it is known that model adaptation to the target speaker improves accuracy. Our research question is whether a DNN for speech enhancement can be adapted to unknown speakers without any auxiliary guidance signal in the test phase. To achieve this, we adopt multi-task learning of speech enhancement and speaker identification, and use the output of the final hidden layer of the speaker identification branch as an auxiliary feature. In addition, we use multi-head self-attention to capture long-term dependencies in the speech and noise. Experimental results on a public dataset show that our strategy achieves state-of-the-art performance and also outperforms conventional methods in terms of subjective quality.
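
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the time-averaged speaker embedding, the residual connection, the sigmoid mask output, and every function and parameter name below (`init_params`, `multi_head_self_attention`, `enhance_with_self_adaptation`, `w_spk`, and so on) are assumptions made only for illustration. The sketch shows a speaker-identification branch whose hidden output is fed back as an auxiliary feature, followed by scaled dot-product multi-head self-attention over time frames.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over time. x: (T, D) -> (T, D)."""
    n_frames, dim = x.shape
    if dim % num_heads != 0:
        raise ValueError("model dimension must be divisible by num_heads")
    d_head = dim // num_heads

    def split(h):
        # (T, D) -> (H, T, d_head)
        return h.reshape(n_frames, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (H, T, T)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    context = weights @ v  # (H, T, d_head)
    merged = context.transpose(1, 0, 2).reshape(n_frames, dim)
    return merged @ w_o, weights


def init_params(rng, n_freq, spk_dim, model_dim, n_speakers):
    # Hypothetical random weights; a real model would learn these.
    def w(rows, cols):
        return 0.1 * rng.standard_normal((rows, cols))

    return {
        "w_spk": w(n_freq, spk_dim),       # speaker branch hidden layer
        "w_cls": w(spk_dim, n_speakers),   # speaker-ID classifier head
        "w_in": w(n_freq + spk_dim, model_dim),
        "w_q": w(model_dim, model_dim),
        "w_k": w(model_dim, model_dim),
        "w_v": w(model_dim, model_dim),
        "w_o": w(model_dim, model_dim),
        "w_mask": w(model_dim, n_freq),
    }


def enhance_with_self_adaptation(noisy_feats, params, num_heads=4):
    """noisy_feats: (T, F) non-negative spectral features of the test utterance."""
    n_frames = noisy_feats.shape[0]
    # Speaker branch: its final hidden output doubles as the auxiliary feature.
    spk_hidden = np.tanh(noisy_feats @ params["w_spk"])  # (T, S)
    spk_embed = spk_hidden.mean(axis=0)  # assumption: average over time
    spk_logits = spk_embed @ params["w_cls"]  # used only by the speaker-ID loss
    # Enhancement branch: condition every frame on the speaker embedding.
    aux = np.concatenate([noisy_feats, np.tile(spk_embed, (n_frames, 1))], axis=1)
    h = np.tanh(aux @ params["w_in"])  # (T, D)
    h_att, _ = multi_head_self_attention(
        h, params["w_q"], params["w_k"], params["w_v"], params["w_o"], num_heads
    )
    # Assumption: residual connection and a sigmoid time-frequency mask.
    mask = 1.0 / (1.0 + np.exp(-((h + h_att) @ params["w_mask"])))  # (T, F)
    return mask * noisy_feats, spk_logits
```

In multi-task training, one would typically combine an enhancement loss on the masked output with a cross-entropy loss on `spk_logits`; at test time only the noisy utterance is needed, which is the sense in which the adaptation requires no auxiliary guidance signal.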

Comments:	5 pages, to appear in IEEE ICASSP 2020
Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
Cite as:	arXiv:2002.05873 [eess.AS]
	(or arXiv:2002.05873v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2002.05873

Submission history

From: Yuma Koizumi
[v1] Fri, 14 Feb 2020 05:05:36 UTC (682 KB)
