Discriminatory and orthogonal feature learning for noise robust keyword spotting

Kim, Donghyeon; Ko, Kyungdeuk; Han, David K.; Ko, Hanseok

doi:10.1109/LSP.2022.3203911

arXiv:2210.11519 (eess)

[Submitted on 20 Oct 2022]

Abstract: Keyword Spotting (KWS) is an essential component in a smart device for alerting the system when a user prompts it with a command. As these devices are typically constrained by computational and energy resources, the KWS model should be designed with a small footprint. In our previous work, we developed lightweight dynamic filters which extract a robust feature map within a noisy environment. The learning variables of the dynamic filter are jointly optimized with the KWS weights using the Cross-Entropy (CE) loss. CE loss alone, however, is not sufficient for high performance when the SNR is low. To train the network for more robust performance in noisy environments, we introduce the LOw Variant Orthogonal (LOVO) loss. The LOVO loss is composed of a triplet loss applied to the output of the dynamic filter, a spectral norm-based orthogonal loss, and an inner class distance loss applied in the KWS model. These losses are particularly useful in encouraging the network to extract discriminatory features in unseen noise environments.
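The abstract names the three components of the LOVO loss but gives no formulas, so the following is only an illustrative sketch of how such terms are commonly written, not the authors' implementation. Every function name, the margin, the weighting coefficients, and the exact form of each term are assumptions made for illustration: the triplet term is a standard hinge on Euclidean distances, the orthogonal term is the spectral norm of `W^T W - I`, and the inner class term is the mean squared distance to each class centroid.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances.

    Assumed form; the margin value is a placeholder, not taken from the paper.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def spectral_orthogonal_loss(weight):
    """Largest singular value of W^T W - I.

    The value is zero exactly when the columns of `weight` are orthonormal.
    """
    gram = weight.T @ weight
    residual = gram - np.eye(gram.shape[0])
    return float(np.linalg.norm(residual, ord=2))  # ord=2 is the spectral norm


def inner_class_distance_loss(features, labels):
    """Mean squared distance from each feature vector to its class centroid."""
    total = 0.0
    for c in np.unique(labels):
        members = features[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return float(total / len(features))


def combined_loss(ce, trip, orth, inner, w_trip=1.0, w_orth=1.0, w_inner=1.0):
    """Weighted sum of CE and the three auxiliary terms (weights are hypothetical)."""
    return ce + w_trip * trip + w_orth * orth + w_inner * inner
```

In this sketch the triplet term would be evaluated on the dynamic filter output and the other two terms inside the KWS model, matching the placement described in the abstract. How the terms are actually weighted and where the orthogonality constraint is applied are details the abstract does not specify.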

Comments:	Published in SPL
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2210.11519 [eess.AS]
	(or arXiv:2210.11519v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2210.11519
Related DOI:	https://doi.org/10.1109/LSP.2022.3203911

Submission history

From: Donghyeon Kim
[v1] Thu, 20 Oct 2022 18:44:16 UTC (991 KB)
