Multi-Talker MVDR Beamforming Based on Extended Complex Gaussian Mixture Model

Chen, Hangting; Zhang, Pengyuan; Yan, Yonghong

arXiv:1910.07753 (eess)

[Submitted on 17 Oct 2019]

Abstract: In this letter, we present a novel multi-talker minimum variance distortionless response (MVDR) beamformer as the front-end of an automatic speech recognition (ASR) system in a dinner party scenario. The CHiME-5 dataset is used to evaluate the proposal in an overlapping multi-talker scenario with severe noise. A detailed study of beamforming is conducted based on the proposed extended complex Gaussian mixture model (CGMM), integrated with various speech separation and speech enhancement masks. Three main changes are made to adapt the original CGMM-based MVDR to the multi-talker scenario. First, the number of Gaussian distributions is extended to 3 with an additional interference speaker model. Second, the mixture coefficients are introduced as a supervisor to generate more elaborate masks and avoid permutation problems. Third, we reorganize the MVDR and mask-based speech separation to achieve both noise reduction and target speaker extraction. With the official baseline ASR back-end, our front-end algorithm achieves an absolute WER reduction of 13.87% compared with the baseline front-end.
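The abstract does not give the beamformer equations, so the following is only a minimal NumPy sketch of generic mask-based MVDR, not the authors' implementation. It assumes three time-frequency masks (target speaker, interfering speaker, noise), such as the posteriors of a three-component CGMM, are already available. It also assumes the widely used reference-channel MVDR form w = (Φ_d⁻¹ Φ_t) u / tr(Φ_d⁻¹ Φ_t), where the distortion covariance Φ_d is estimated from the interferer and noise masks together. The function names, array layout, and diagonal loading are hypothetical choices made for illustration.

```python
import numpy as np


def masked_covariance(stft, mask, eps=1e-10):
    """Mask-weighted spatial covariance for each frequency bin.

    stft: complex array of shape (F, T, M) = (frequencies, frames, microphones).
    mask: real array of shape (F, T) with values in [0, 1].
    Returns a complex array of shape (F, M, M).
    """
    weighted = stft * mask[..., None]
    # Sum over frames of mask * y y^H, then normalize by the total mask weight.
    cov = np.einsum("ftm,ftn->fmn", weighted, stft.conj())
    return cov / (mask.sum(axis=1)[:, None, None] + eps)


def mvdr_weights(cov_target, cov_distortion, ref_channel=0, diag_load=1e-6):
    """Reference-channel MVDR weights, shape (F, M).

    Diagonal loading (an assumption of this sketch) keeps the distortion
    covariance well conditioned before it is inverted.
    """
    n_mics = cov_target.shape[-1]
    trace = np.trace(cov_distortion, axis1=1, axis2=2).real
    loaded = cov_distortion + (diag_load * trace / n_mics)[:, None, None] * np.eye(n_mics)
    # Solve loaded @ X = cov_target for every frequency bin at once.
    numerator = np.linalg.solve(loaded, cov_target)
    denominator = np.trace(numerator, axis1=1, axis2=2)[:, None]
    return numerator[:, :, ref_channel] / (denominator + 1e-10)


def apply_beamformer(weights, stft):
    """Apply w^H y per time-frequency bin; returns shape (F, T)."""
    return np.einsum("fm,ftm->ft", weights.conj(), stft)


def multi_talker_mvdr(stft, mask_target, mask_interferer, mask_noise):
    """Extract the target speaker, treating interferer plus noise as distortion."""
    cov_target = masked_covariance(stft, mask_target)
    cov_distortion = masked_covariance(stft, mask_interferer + mask_noise)
    weights = mvdr_weights(cov_target, cov_distortion)
    return apply_beamformer(weights, stft)
```

Calling `multi_talker_mvdr(stft, m_target, m_interferer, m_noise)` on an `(F, T, M)` STFT returns a single-channel `(F, T)` estimate of the target as observed at the reference microphone. Pooling the interferer and noise masks into one distortion covariance is one simple way to obtain both noise reduction and target speaker extraction; the paper's actual arrangement of MVDR and mask-based separation may differ.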

Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1910.07753 [eess.AS]
	(or arXiv:1910.07753v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1910.07753

Submission history

From: Hangting Chen
[v1] Thu, 17 Oct 2019 07:51:36 UTC (988 KB)
