A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement

Wang, Tianrui; Zhu, Weibin

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2108.11877 (eess)

This paper has been withdrawn by Tianrui Wang

[Submitted on 26 Aug 2021 (v1), last revised 23 Apr 2023 (this version, v4)]

Abstract: Deep learning has been widely applied to speech enhancement. While testing the effectiveness of various network structures, researchers are also exploring improvements to the loss function used in network training. Although existing methods have considered the auditory characteristics of speech or a reasonable expression of the signal-to-noise ratio, their correlation with auditory evaluation scores and the suitability of their calculation for gradient optimization still need to be improved. In this paper, a signal-to-noise ratio loss function based on auditory power compression is proposed. The experimental results show that the overall correlation between the proposed function and objective speech intelligibility indexes is better than that of other loss functions. For the same speech enhancement model, training with this loss function also gives better results than the other methods compared.
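The abstract does not give the formulation of the proposed loss, and the paper itself has been withdrawn, so the following is only a rough sketch of the general idea of a signal-to-noise ratio computed on power-compressed spectra. The function name `compressed_snr_loss`, the compression exponent `power=0.3`, and the stabilizer `eps` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def compressed_snr_loss(clean_mag, est_mag, power=0.3, eps=1e-8):
    """Negative SNR in dB, computed on power-law-compressed magnitudes.

    clean_mag, est_mag: equal-length sequences of non-negative spectral
    magnitudes (for example, flattened STFT magnitudes).
    power: hypothetical compression exponent; 0.3 is an assumed value,
    not one reported in the abstract.
    eps: small constant that keeps the ratio and the logarithm finite.
    """
    if len(clean_mag) != len(est_mag):
        raise ValueError("clean_mag and est_mag must have the same length")

    signal_energy = 0.0
    error_energy = 0.0
    for c, e in zip(clean_mag, est_mag):
        # Compress each magnitude before comparing, so that low-level
        # components contribute more than they would on a linear scale.
        c_comp = c ** power
        e_comp = e ** power
        signal_energy += c_comp ** 2
        error_energy += (c_comp - e_comp) ** 2

    snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
    # Higher SNR is better, so the loss is its negative.
    return -snr_db
```

In a training setup, the same arithmetic would normally be written with a tensor library so that gradients can flow through it. The pure-Python version here only shows the order of operations: compress, measure the error energy, then take the log ratio.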

Comments:	This work was carried over into other work and was published
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2108.11877 [eess.AS]
	(or arXiv:2108.11877v4 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2108.11877

Submission history

From: Tianrui Wang
[v1] Thu, 26 Aug 2021 16:12:38 UTC (419 KB)
[v2] Fri, 27 Aug 2021 12:04:16 UTC (421 KB)
[v3] Thu, 14 Oct 2021 06:45:29 UTC (352 KB)
[v4] Sun, 23 Apr 2023 03:18:33 UTC (1 KB) (withdrawn)
