Computer Science > Computation and Language
[Submitted on 23 Mar 2017 (this version), latest version 5 Sep 2017 (v2)]
Title:An embedded segmental k-means model for unsupervised segmentation and clustering of speech
View PDFAbstract:Unsupervised segmentation and clustering of unlabelled speech are core problems in zero-resource speech processing. Most competitive approaches lie at methodological extremes: some follow a Bayesian approach, defining probabilistic models with convergence guarantees, while others opt for more efficient heuristic techniques. Here we introduce an approximation to a segmental Bayesian model that falls in between, with a clear objective function but using hard clustering and segmentation rather than full Bayesian inference. Like its Bayesian counterpart, this embedded segmental k-means model (ES-KMeans) represents arbitrary-length word segments as fixed-dimensional acoustic word embeddings. On English and Xitsonga data, ES-KMeans outperforms a leading heuristic method in word segmentation, giving similar scores to the Bayesian model while being five times faster with fewer hyperparameters. However, there is a trade-off in cluster purity, with the Bayesian model's purer clusters yielding about 10% better unsupervised word error rates.
Submission history
From: Herman Kamper [view email][v1] Thu, 23 Mar 2017 16:45:22 UTC (527 KB)
[v2] Tue, 5 Sep 2017 14:14:11 UTC (272 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.