Toward noise-robust whisper keyword spotting on headphones with in-earcup microphone and curriculum learning

Yang, Qiaoyu; Zhang, Shuo; Huang, Chuan-Che

arXiv:2502.00295 (eess)

[Submitted on 1 Feb 2025]

Abstract: The expanding feature set of modern headphones poses a challenge for the design of their control interface. Users may want to control each feature separately or switch quickly between modes that activate different features. The traditional approach of physical buttons may no longer be feasible when the feature set is large. Keyword spotting with voice commands is a promising solution. Most existing keyword spotting methods support only commands spoken in a regular voice, which may not be desirable in quiet places or public settings. In this paper, we investigate on-device keyword spotting for whispered speech and explore approaches to improve noise robustness. We leverage the inner microphone on noise-cancellation headphones as an additional source of voice input. We also design a curriculum learning strategy that gradually increases the proportion of whisper keywords during training. We demonstrate through experiments that the combination of multi-microphone processing and curriculum learning can improve the F1 score of whisper keyword spotting by up to 15% in noisy conditions.
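
The curriculum idea in the abstract, raising the proportion of whisper keywords as training proceeds, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual schedule, so the linear ramp, the 0.0 to 0.5 range, and the names `whisper_fraction` and `sample_batch` are all hypothetical and not taken from the paper.

```python
import random


def whisper_fraction(epoch, total_epochs, start=0.0, end=0.5):
    """Assumed linear schedule: share of whisper examples at a given epoch.

    The ramp shape and the start/end values are illustrative assumptions,
    not the schedule used by the authors.
    """
    if total_epochs <= 1:
        return end
    # Clamp progress to [0, 1] so out-of-range epochs stay well defined.
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * progress


def sample_batch(normal, whisper, batch_size, fraction, rng):
    """Draw a batch whose whisper share follows the curriculum fraction."""
    n_whisper = round(batch_size * fraction)
    n_normal = batch_size - n_whisper
    # Sample with replacement from each pool, then mix the two groups.
    batch = rng.choices(whisper, k=n_whisper) + rng.choices(normal, k=n_normal)
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    normal = [("normal", i) for i in range(100)]    # placeholder examples
    whisper = [("whisper", i) for i in range(100)]  # placeholder examples
    for epoch in range(5):
        frac = whisper_fraction(epoch, total_epochs=5)
        batch = sample_batch(normal, whisper, 8, frac, rng)
        n_w = sum(1 for kind, _ in batch if kind == "whisper")
        print(f"epoch {epoch}: fraction={frac:.3f}, whisper in batch={n_w}")
```

A linear ramp is only one possible choice; a stepwise or exponential schedule would fit the same description of gradually increasing the whisper proportion.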

Comments:	Accepted to ICASSP 2025
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2502.00295 [eess.AS]
	(or arXiv:2502.00295v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2502.00295

Submission history

From: Qiaoyu Yang
[v1] Sat, 1 Feb 2025 03:37:17 UTC (432 KB)
