Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system

Kinoshita, Keisuke; Delcroix, Marc; Araki, Shoko; Nakatani, Tomohiro

arXiv:2003.03987 (eess)

[Submitted on 9 Mar 2020]

Abstract: Automatic meeting analysis is a fundamental technology required to let, e.g., smart devices follow and respond to our conversations. To achieve optimal automatic meeting analysis, we previously proposed an all-neural approach that jointly solves the source separation, speaker diarization, and source counting problems in an optimal way (in the sense that all three tasks can be jointly optimized through error back-propagation). That method was shown to handle simulated clean (noiseless and anechoic) dialog-like data well and to achieve very good performance in comparison with several conventional methods. However, it was not clear whether such an all-neural approach would generalize successfully to more complicated real meeting data containing more spontaneously speaking speakers, severe noise, and reverberation, or how it would perform in comparison with state-of-the-art systems in such scenarios. In this paper, we first consider the practical issues that must be addressed to improve the robustness of the all-neural approach, and then show experimentally that, even in real meeting scenarios, the all-neural approach can perform effective speech enhancement and simultaneously outperform state-of-the-art systems.

Comments:	8 pages, to appear in ICASSP2020
Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2003.03987 [eess.AS]
	(or arXiv:2003.03987v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2003.03987

Submission history

From: Keisuke Kinoshita
[v1] Mon, 9 Mar 2020 09:25:38 UTC (1,026 KB)
