Real-time speech enhancement using equilibriated RNN

Takeuchi, Daiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru


arXiv:2002.05843 (eess)

[Submitted on 14 Feb 2020]

Abstract: We propose a speech enhancement method using a causal deep neural network (DNN) for real-time applications. DNNs have been widely used to estimate a time-frequency (T-F) mask that enhances a speech signal. One popular DNN structure for this purpose is the recurrent neural network (RNN), owing to its ability to model time-sequential data such as speech effectively. In particular, long short-term memory (LSTM) is often used to alleviate the vanishing/exploding gradient problem, which makes training an RNN difficult. However, LSTM mitigates this difficulty at the price of a larger number of parameters, which requires more computational resources. For real-time speech enhancement, a smaller network that does not sacrifice performance is preferable. In this paper, we propose to use the equilibriated recurrent neural network (ERNN) to avoid the vanishing/exploding gradient problem without increasing the number of parameters. The proposed structure is causal, that is, it requires only information from the past, so that it can be applied in real time. Compared to uni- and bi-directional LSTM networks, the proposed method achieved similar performance with far fewer parameters.
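The two ideas in the abstract, the parameter cost of LSTM and causal T-F masking, can be illustrated with a minimal sketch. It is not the authors' implementation and does not implement ERNN. The layer sizes (257 frequency bins, 512 hidden units), the function names, and the `mask_fn` interface are all assumptions made for illustration. The parameter counts use the textbook formulation with one bias vector per gate; some libraries count differently.

```python
import numpy as np


def simple_rnn_params(input_size, hidden_size):
    # Plain recurrent layer: input weights + recurrent weights + one bias vector.
    return hidden_size * (input_size + hidden_size) + hidden_size


def lstm_params(input_size, hidden_size):
    # LSTM has three gates plus a cell candidate, each with its own
    # input weights, recurrent weights, and bias: four times the plain layer.
    return 4 * simple_rnn_params(input_size, hidden_size)


def apply_mask_causally(noisy_spec, mask_fn):
    """Enhance a spectrogram frame by frame using only past information.

    noisy_spec: complex array of shape (frames, bins).
    mask_fn:    hypothetical callable (magnitude_frame, state) -> (mask, state);
                it stands in for any causal recurrent mask estimator.
    """
    enhanced = np.empty_like(noisy_spec)
    state = None  # the recurrent state carries information from past frames only
    for t in range(noisy_spec.shape[0]):
        mask, state = mask_fn(np.abs(noisy_spec[t]), state)
        enhanced[t] = mask * noisy_spec[t]  # element-wise T-F masking
    return enhanced


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the 4x ratio.
    print(simple_rnn_params(257, 512), lstm_params(257, 512))
```

Because the loop never looks at frame `t + 1` or later, the output for each frame is available as soon as that frame arrives, which is the property a real-time system needs. A bi-directional network cannot be run this way, since it requires the whole utterance before producing any output.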

Comments:	To appear in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020)
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2002.05843 [eess.AS]
	(or arXiv:2002.05843v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2002.05843

Submission history

From: Daiki Takeuchi
[v1] Fri, 14 Feb 2020 02:00:13 UTC (1,018 KB)
