Speaker Change Detection Using Features through A Neural Network Speaker Classifier

Ge, Zhenhao; Iyer, Ananth N.; Cheluvaraja, Srinath; Ganapathiraju, Aravind

arXiv:1702.02285 (cs)

[Submitted on 8 Feb 2017]

Abstract: A mechanism is proposed for real-time speaker change detection in conversations. It first trains a text-independent neural network speaker classifier on in-domain speaker data. The network then converts features of conversational speech from out-of-domain speakers into likelihood vectors, i.e. similarity scores relative to the in-domain speakers. These transformed features exhibit highly distinctive patterns, which facilitate differentiating speakers and enable speaker change detection with straightforward distance metrics. The speaker classifier is trained and tested on speech from the first 200 (in-domain) male speakers in TIMIT, and the speaker change detector on speech from the remaining 126 (out-of-domain) male speakers. For speaker classification, 100% accuracy over the 200 speakers is achieved on any test file, provided the speech duration is at least 0.97 seconds. For speaker change detection using the classifier outputs, performance with inspection intervals of 0.5, 1, and 2 seconds was evaluated in terms of error rate and F1 score, using data synthesized by concatenating speech from various speakers. The detector captures close to 97% of the changes when comparing the current second of speech with the previous second, which is very competitive with results reported in the literature for other methods.
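The detection step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the distance metric, the decision rule, or how frames are pooled, so everything specific here is an assumption made for illustration: per-frame likelihood vectors are averaged over each inspection interval, adjacent intervals are compared with cosine distance, and a change is flagged when the distance exceeds a hypothetical `threshold`. The toy vectors stand in for classifier outputs over three in-domain speakers and are not data from the paper.

```python
import math

def mean_vector(frames):
    """Average per-frame likelihood vectors over one inspection interval."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def detect_changes(frames, interval, threshold):
    """Return indices of intervals that differ from the previous interval.

    `interval` is the number of frames per inspection interval and
    `threshold` is a hypothetical decision threshold (both assumptions).
    """
    starts = range(0, len(frames) - interval + 1, interval)
    means = [mean_vector(frames[s:s + interval]) for s in starts]
    return [k for k in range(1, len(means))
            if cosine_distance(means[k - 1], means[k]) > threshold]

def f1_score(detected, actual):
    """F1 score of detected change points against the true change points."""
    detected, actual = set(detected), set(actual)
    true_pos = len(detected & actual)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(detected)
    recall = true_pos / len(actual)
    return 2 * precision * recall / (precision + recall)

# Toy likelihood vectors (made up): speaker A, then speaker B, then A again.
speaker_a = [0.7, 0.2, 0.1]
speaker_b = [0.1, 0.2, 0.7]
frames = [speaker_a] * 4 + [speaker_b] * 4 + [speaker_a] * 4

changes = detect_changes(frames, interval=2, threshold=0.3)
score = f1_score(changes, actual=[2, 4])
```

With two frames per interval, the toy stream yields six intervals, and the true speaker changes fall at intervals 2 and 4. Shorter intervals react faster but average over less speech, which is the trade-off the 0.5, 1, and 2 second settings in the abstract explore.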

Comments:	Intelligent System Conference 2017, Sep. 7-8, 2017, London, UK. arXiv admin note: text overlap with arXiv:1702.02289
Subjects:	Sound (cs.SD)
Cite as:	arXiv:1702.02285 [cs.SD]
	(or arXiv:1702.02285v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.1702.02285

Submission history

From: Zhenhao Ge
[v1] Wed, 8 Feb 2017 04:37:40 UTC (551 KB)
