Fixed-point quantization aware training for on-device keyword-spotting

Macha, Sashank; Oza, Om; Escott, Alex; Caliva, Francesco; Armitano, Robbie; Cheekatmalla, Santosh Kumar; Parthasarathi, Sree Hari Krishnan; Liu, Yuzong

arXiv:2303.02284 (eess)

[Submitted on 4 Mar 2023]

Abstract: Fixed-point (FXP) inference has proven suitable for embedded devices with limited computational resources, yet model training is still performed in floating-point (FLP). FXP training has not been fully explored, and the non-trivial conversion from FLP to FXP introduces an unavoidable performance drop. We propose a novel method to train and obtain FXP convolutional keyword-spotting (KWS) models. We combine our methodology with two quantization-aware-training (QAT) techniques for model parameters, squashed weight distribution and absolute cosine regularization, and propose techniques for extending QAT to transient variables, which previous paradigms neglect. Experimental results on the Google Speech Commands v2 dataset show that we can reduce model precision down to 4 bits with no loss in accuracy. Furthermore, on an in-house KWS dataset, we show that our 8-bit FXP-QAT models have a 4-6% improvement in relative false discovery rate at a fixed false reject rate compared to full-precision FLP models. We argue that, during inference, FXP-QAT eliminates q-format normalization and enables the use of low-bit accumulators while maximizing SIMD throughput to reduce user-perceived latency. We demonstrate that we can reduce execution time by 68% without compromising the KWS model's predictive performance or requiring model architectural changes. Our work provides novel findings that aid future research in this area and enable accurate and efficient models.
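
As a rough illustration of what reducing precision to 8 or 4 bits means, the sketch below rounds a few floating-point values onto a signed fixed-point grid. It is not the paper's training method: the function name `to_fixed_point`, the choice of fractional bits, and the sample weights are all assumptions made for this example.

```python
def to_fixed_point(values, total_bits=8, frac_bits=6):
    """Round floats to a signed fixed-point grid (hypothetical helper).

    Returns the integer codes and the values they represent.
    """
    scale = 1 << frac_bits                # grid step is 1 / scale
    qmin = -(1 << (total_bits - 1))       # smallest representable code
    qmax = (1 << (total_bits - 1)) - 1    # largest representable code
    # Scale, round to the nearest integer, then saturate to the code range.
    codes = [max(qmin, min(qmax, round(v * scale))) for v in values]
    dequantized = [c / scale for c in codes]
    return codes, dequantized


# Illustrative weights only; they do not come from the paper.
weights = [-1.5, -0.3, 0.0, 0.26, 0.9]

codes8, deq8 = to_fixed_point(weights, total_bits=8, frac_bits=6)
codes4, deq4 = to_fixed_point(weights, total_bits=4, frac_bits=2)

# Worst-case rounding error at each precision.
err8 = max(abs(w - d) for w, d in zip(weights, deq8))
err4 = max(abs(w - d) for w, d in zip(weights, deq4))
```

With fewer bits the grid is coarser, so `err4` is larger than `err8`; quantization-aware training, in general, aims to make a model tolerate this kind of rounding.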

Comments:	5 pages, 3 figures, 4 tables
Subjects:	Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP)
Cite as:	arXiv:2303.02284 [eess.AS]
	(or arXiv:2303.02284v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2303.02284
Journal reference:	ICASSP 2023

Submission history

From: Sashank Macha
[v1] Sat, 4 Mar 2023 01:06:16 UTC (193 KB)
