Which phoneme-to-viseme maps best improve visual-only computer lip-reading?

Bear, Helen L.; Harvey, Richard W.; Theobald, Barry-John; Lan, Yuxuan


arXiv:1710.01093 (cs)

[Submitted on 3 Oct 2017]



Abstract: A critical assumption of all current visual speech recognition systems is that there are visual speech units, called visemes, which can be mapped to units of acoustic speech, the phonemes. Although a number of such maps have been published, their effectiveness is rarely tested, particularly on visual-only lip-reading (many works use audio-visual speech). Here we examine 120 mappings and consider whether any are stable across talkers. We show a method for devising maps based on phoneme confusions from an automated lip-reading system, and we present new mappings that show improvements for individual talkers.
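As a rough illustration of what a phoneme-to-viseme map is and how one might be derived from confusions, the following minimal Python sketch groups phonemes whose mutual confusion count reaches a threshold and then relabels a phoneme sequence with the resulting viseme classes. It is not the authors' algorithm: the merging rule (transitive, single-linkage grouping), the function names `derive_visemes` and `to_visemes`, the threshold, and the confusion counts are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def derive_visemes(phonemes, confusion, threshold):
    """Group phonemes into viseme classes (illustrative assumption, not the paper's method).

    Two phonemes are merged when the sum of their confusion counts in both
    directions is at least `threshold`. Merging is transitive.
    """
    # Each phoneme starts in its own class (simple union-find structure).
    parent = {p: p for p in phonemes}

    def find(p):
        # Follow parent links to the class representative, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(phonemes, 2):
        mutual = confusion.get((a, b), 0) + confusion.get((b, a), 0)
        if mutual >= threshold:
            parent[find(b)] = find(a)

    # Name the classes V1, V2, ... in order of their first phoneme.
    roots = sorted({find(p) for p in phonemes}, key=phonemes.index)
    labels = {root: f"V{i + 1}" for i, root in enumerate(roots)}
    return {p: labels[find(p)] for p in phonemes}


def to_visemes(phoneme_sequence, p2v_map):
    """Apply a many-to-one phoneme-to-viseme map to a phoneme sequence."""
    return [p2v_map[p] for p in phoneme_sequence]


# Hypothetical confusion counts: (actual phoneme, recognised phoneme) -> count.
phonemes = ["p", "b", "m", "f", "v", "t"]
confusion = {
    ("p", "b"): 12, ("b", "p"): 9,
    ("b", "m"): 7, ("m", "b"): 6,
    ("f", "v"): 10, ("v", "f"): 8,
    ("t", "p"): 1,
}

p2v = derive_visemes(phonemes, confusion, threshold=10)
viseme_sequence = to_visemes(["m", "v", "t"], p2v)
```

With these made-up counts, the three bilabials fall into one class, the two labiodentals into another, and `t` stays on its own. Raising the threshold yields more, smaller classes, and lowering it yields fewer, larger ones.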

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1710.01093 [cs.CV]
	(or arXiv:1710.01093v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1710.01093
Journal reference:	Helen L. Bear, Richard W. Harvey, Barry-John Theobald, and Yuxuan Lan. Which phoneme-to-viseme maps best improve visual-only computer lip-reading? Advances in Visual Computing, 2014, pp. 230-239

Submission history

From: Helen L Bear
[v1] Tue, 3 Oct 2017 11:44:40 UTC (90 KB)
