A deep representation learning speech enhancement method using $\beta$-VAE

Xiang, Yang; Højvang, Jesper Lisby; Rasmussen, Morten Højfeldt; Christensen, Mads Græsbøll

arXiv:2205.05581 (eess)

[Submitted on 11 May 2022]

Abstract: In previous work, we proposed a variational autoencoder (VAE)-based Bayesian permutation training speech enhancement (SE) method (PVAE), which indicated that the SE performance of the traditional deep neural network (DNN)-based method could be improved by deep representation learning (DRL). Building on that work, in this paper we propose using $\beta$-VAE to further improve PVAE's representation learning ability. More specifically, our $\beta$-VAE improves PVAE's capacity to disentangle different latent variables from the observed signal without incurring the trade-off between disentanglement and signal reconstruction that is widespread in previous $\beta$-VAE algorithms. Unlike those algorithms, the proposed $\beta$-VAE strategy can also be used to optimize the DNN's structure. As a result, the proposed method not only improves PVAE's SE performance but also reduces the number of PVAE training parameters. The experimental results show that the proposed method learns better speech and noise latent representations than PVAE. It also achieves a higher scale-invariant signal-to-distortion ratio, better speech quality, and better speech intelligibility.
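
For readers unfamiliar with the scale-invariant signal-to-distortion ratio (SI-SDR) named in the abstract, the following is a minimal sketch of its commonly used definition: the estimate is projected onto the clean reference, and the energy of that projection is compared with the energy of the residual. The function name `si_sdr` and the `eps` guard are illustrative assumptions and are not taken from the paper. The mean removal that is often applied before computing the metric is omitted here for brevity.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Return the scale-invariant SDR in dB for two equal-length sequences.

    Illustrative helper (hypothetical name); not the paper's evaluation code.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")

    # Optimal scaling of the reference: alpha = <estimate, reference> / ||reference||^2
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / (ref_energy + eps)

    # Split the estimate into a scaled-reference part and a residual part.
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)

    # eps keeps the ratio finite for silent or perfectly matched signals.
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


if __name__ == "__main__":
    clean = [1.0, 0.0, -1.0, 0.0]
    enhanced = [1.0, 0.1, -1.0, -0.1]
    print(si_sdr(enhanced, clean))
    # Rescaling the estimate should leave the score essentially unchanged.
    print(si_sdr([3.0 * x for x in enhanced], clean))
```

Because the reference is rescaled to best match the estimate before the residual is measured, multiplying the estimate by a nonzero constant should not change the score, which is the property that distinguishes SI-SDR from a plain SDR.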

Comments:	Submitted to EUSIPCO
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2205.05581 [eess.AS]
	(or arXiv:2205.05581v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2205.05581

Submission history

From: Yang Xiang
[v1] Wed, 11 May 2022 15:49:16 UTC (1,019 KB)