Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting

López-Espejo, Iván; Shekar, Ram C. M. C.; Tan, Zheng-Hua; Jensen, Jesper; Hansen, John H. L.

arXiv:2211.10565 (eess)

[Submitted on 19 Nov 2022 (v1), last revised 23 Feb 2023 (this version, v2)]

Abstract: In the context of keyword spotting (KWS), replacing handcrafted speech features with learnable features has not yielded superior KWS performance. In this study, we demonstrate that filterbank learning outperforms handcrafted speech features for KWS when the number of filterbank channels is severely reduced. Reducing the number of channels may cause some drop in KWS performance, but it also substantially reduces energy consumption, which is key when deploying always-on KWS on low-resource devices. Experimental results on a noisy version of the Google Speech Commands Dataset show that filterbank learning adapts to noise characteristics to provide a higher degree of robustness to noise, especially when dropout is integrated. Thus, switching from the typically used 40-channel log-Mel features to 8-channel learned features leads to a relative KWS accuracy loss of only 3.5% while simultaneously achieving a 6.3x reduction in energy consumption.
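
To make the 40-channel versus 8-channel comparison concrete, the following is a minimal NumPy sketch of log-filterbank feature extraction. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the filterbank parameterization, so the triangular Mel construction, the 512-point FFT, the 16 kHz sample rate, the non-negativity clamp, and all function names here are hypothetical choices. In a filterbank-learning setup, the `weights` matrix would typically be a trainable parameter (possibly initialized from the Mel filterbank) rather than a fixed one.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_channels, n_fft=512, sample_rate=16000):
    """Triangular Mel filterbank of shape (n_channels, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    # n_channels + 2 edge frequencies, equally spaced on the Mel scale.
    edges = mel_to_hz(
        np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_channels + 2)
    )
    fbank = np.zeros((n_channels, n_bins))
    for k in range(n_channels):
        left, center, right = edges[k], edges[k + 1], edges[k + 2]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        fbank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank


def log_filterbank_features(power_spec, weights, eps=1e-6):
    """power_spec: (n_frames, n_bins); weights: (n_channels, n_bins)."""
    # Assumption: clamp weights to be non-negative so the log argument
    # stays positive even if the weights were updated by training.
    return np.log(power_spec @ np.maximum(weights, 0.0).T + eps)


# Hypothetical input: 100 frames of random noise standing in for speech.
rng = np.random.default_rng(0)
frames = rng.standard_normal((100, 512))
power_spec = np.abs(np.fft.rfft(frames, n=512, axis=1)) ** 2

feats_40 = log_filterbank_features(power_spec, mel_filterbank(40))
feats_8 = log_filterbank_features(power_spec, mel_filterbank(8))
print(feats_40.shape, feats_8.shape)
```

With 8 channels instead of 40, each frame carries a fifth as many feature values into the downstream KWS model, which is the kind of reduction the abstract associates with lower energy consumption.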

Subjects:	Audio and Speech Processing (eess.AS); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2211.10565 [eess.AS]
	(or arXiv:2211.10565v2 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2211.10565

Submission history

From: Iván López-Espejo
[v1] Sat, 19 Nov 2022 02:20:14 UTC (282 KB)
[v2] Thu, 23 Feb 2023 21:38:32 UTC (282 KB)
