Neural Machine Translation for Multilingual Grapheme-to-Phoneme Conversion

Sokolov, Alex; Rohlin, Tracy; Rastrow, Ariya

arXiv:2006.14194 (cs)

[Submitted on 25 Jun 2020 (v1), last revised 28 Jun 2020 (this version, v2)]

Abstract: Grapheme-to-phoneme (G2P) models are a key component of Automatic Speech Recognition (ASR) systems, such as the ASR system in Alexa, where they generate pronunciations for out-of-vocabulary words that are missing from the pronunciation lexicons (mappings such as "e c h o" to "E k oU"). Most G2P systems are monolingual and based on traditional joint-sequence n-gram models [1,2]. As an alternative, we present a single end-to-end trained neural G2P model that shares the same encoder and decoder across multiple languages. This allows the model to utilize a combination of universal symbol inventories of Latin-like alphabets and cross-linguistically shared feature representations. Such a model is especially useful for low-resource languages and for code switching and foreign words, where pronunciations in one language need to be adapted to other locales or accents. We further experiment with a word language distribution vector as an additional training target, which is intended to improve system performance by helping the model decouple pronunciations across a variety of languages in the parameter space. We show a 7.2% average improvement in phoneme error rate over low-resource languages and no degradation over high-resource ones compared to monolingual baselines.
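
The abstract reports results as phoneme error rate but does not define it. A minimal sketch of how such a metric is commonly computed, assuming the usual definition (Levenshtein edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes), might look as follows. The function names `edit_distance` and `phoneme_error_rate` are illustrative and are not taken from the paper, and the authors' exact scoring procedure may differ.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 reference phonemes
    # and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # reference phoneme missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    pairs: Iterable[Tuple[Sequence[str], Sequence[str]]]
) -> float:
    """Total edits divided by total reference phonemes.

    Assumption: corpus-level aggregation; the paper may aggregate differently.
    """
    total_edits = 0
    total_ref = 0
    for ref, hyp in pairs:
        total_edits += edit_distance(ref, hyp)
        total_ref += len(ref)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    return total_edits / total_ref


# Usage with the lexicon entry quoted in the abstract ("e c h o" -> "E k oU")
# and a hypothetical model output that gets the final vowel wrong.
reference = "E k oU".split()
hypothesis = "E k o".split()
per = phoneme_error_rate([(reference, hypothesis)])
```

Here one of the three reference phonemes is substituted, so `per` is 1/3. Phonemes are treated as whitespace-separated tokens, so a multi-character symbol such as "oU" counts as a single unit.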

Comments:	Published in INTERSPEECH (2019)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2006.14194 [cs.CL]
	(or arXiv:2006.14194v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2006.14194

Submission history

From: Alex Sokolov
[v1] Thu, 25 Jun 2020 06:16:29 UTC (138 KB)
[v2] Sun, 28 Jun 2020 23:36:47 UTC (138 KB)
