Spoken Language Intent Detection using Confusion2Vec

Shivakumar, Prashanth Gurunath; Yang, Mu; Georgiou, Panayiotis

doi:10.21437/Interspeech.2019-2226

arXiv:1904.03576 (cs)

[Submitted on 7 Apr 2019 (v1), last revised 2 Jul 2019 (this version, v3)]

Abstract: Decoding a speaker's intent is a crucial part of spoken language understanding (SLU). In real-life scenarios, noise and errors in the text transcriptions make the task more challenging. In this paper, we address spoken language intent detection under the noisy conditions imposed by automatic speech recognition (ASR) systems. We propose to employ the confusion2vec word feature representation to compensate for the errors made by ASR and to increase the robustness of the SLU system. Confusion2vec, motivated by human speech production and perception, models acoustic relationships between words in addition to the semantic and syntactic relations of words in human language. We hypothesize that ASR often makes errors involving acoustically similar words, and that confusion2vec, with its inherent model of acoustic relationships between words, is able to compensate for these errors. Through experiments on the ATIS benchmark dataset, we demonstrate that the proposed model is robust and achieves state-of-the-art results under noisy ASR conditions. Our system reduces classification error rate (CER) by 20.84% and improves robustness by 37.48% (lower CER degradation) relative to the previous state-of-the-art going from clean to noisy transcripts. Improvements are also demonstrated when training the intent detection models on noisy transcripts.
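The relative figures quoted in the abstract can be read with the help of a small sketch. The Python below is illustrative only: the function names are hypothetical, the intent labels and CER values in the usage note are made-up numbers rather than results from the paper, and measuring "CER degradation" as the absolute increase in CER from clean to noisy transcripts is an assumption, since the abstract does not spell out the exact definition.

```python
def classification_error_rate(predicted, reference):
    """Percentage of utterances whose predicted intent differs from the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("predicted and reference must be non-empty and equally long")
    errors = sum(p != r for p, r in zip(predicted, reference))
    return 100.0 * errors / len(reference)


def cer_degradation(clean_cer, noisy_cer):
    """Increase in CER (percentage points) when moving from clean to noisy input.

    Assumption: degradation is taken as an absolute difference.
    """
    return noisy_cer - clean_cer


def relative_reduction(baseline, proposed):
    """Relative reduction (in percent) of `proposed` with respect to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - proposed) / baseline
```

As a usage example with made-up numbers: if a baseline system had a CER of 2.0% on clean transcripts and 6.0% on ASR transcripts, and a proposed system had 2.0% and 4.5%, then `relative_reduction(6.0, 4.5)` gives a 25.0% relative CER reduction on noisy input, and `relative_reduction(cer_degradation(2.0, 6.0), cer_degradation(2.0, 4.5))` gives a 37.5% relative improvement in robustness.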

Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Report number:	2226
Cite as:	arXiv:1904.03576 [cs.CL]
	(or arXiv:1904.03576v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1904.03576
Journal reference:	Proceedings of Interspeech 2019
Related DOI:	https://doi.org/10.21437/Interspeech.2019-2226

Submission history

From: Panayiotis Georgiou
[v1] Sun, 7 Apr 2019 03:18:44 UTC (156 KB)
[v2] Thu, 27 Jun 2019 23:14:58 UTC (156 KB)
[v3] Tue, 2 Jul 2019 00:31:05 UTC (156 KB)
