Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages

Huang, Kuan-Po; Yang, Chih-Kai; Fu, Yu-Kuan; Dunbar, Ewan; Lee, Hung-yi

arXiv:2310.03018 (eess)

[Submitted on 4 Oct 2023 (v1), last revised 18 Mar 2024 (this version, v3)]

Abstract: We introduce a new zero-resource code-switched speech benchmark designed to directly assess the code-switching capabilities of self-supervised speech encoders. We showcase a baseline system of language modeling on discrete units to demonstrate how the code-switching abilities of speech encoders can be assessed in a zero-resource manner. Our experiments encompass a variety of well-known speech encoders, including Wav2vec 2.0, HuBERT, and XLSR. We examine the impact of pre-training languages and model size on benchmark performance. Notably, although our results demonstrate that speech encoders with multilingual pre-training, exemplified by XLSR, outperform monolingual variants (Wav2vec 2.0, HuBERT) in code-switching scenarios, there is still substantial room for improvement in their code-switching linguistic abilities.
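
The pair-based evaluation mentioned in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: it assumes each pair holds one well-formed and one corrupted utterance, that both have already been turned into sequences of discrete unit IDs, and it swaps in a toy add-one-smoothed bigram model for whatever unit language model the paper uses. The names `train_bigram_lm` and `pair_accuracy`, and all the data, are hypothetical.

```python
import math
from collections import Counter


def train_bigram_lm(sequences, vocab_size):
    """Fit a toy bigram model on discrete-unit sequences (assumed stand-in LM)."""
    bigrams = Counter()
    contexts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1

    def score(seq):
        # Length-normalised log-probability with add-one smoothing.
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
            total += math.log(p)
        return total / max(len(seq) - 1, 1)

    return score


def pair_accuracy(score, pairs):
    """Fraction of pairs where the well-formed utterance gets the higher score."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)


# Made-up unit sequences; real ones would come from quantised encoder features.
train = [[0, 1, 2, 3], [0, 1, 2, 3], [1, 2, 3, 0]]
score = train_bigram_lm(train, vocab_size=4)

# Each pair is (assumed well-formed, assumed corrupted).
pairs = [([0, 1, 2, 3], [3, 2, 1, 0]), ([1, 2, 3], [2, 1, 3])]
accuracy = pair_accuracy(score, pairs)
print(accuracy)
```

Under these assumptions, an accuracy near 0.5 would indicate the model cannot tell the two members of a pair apart, and no transcripts or labels are needed to compute it.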

Comments:	Accepted by ICASSP 2024 (v2)
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
Cite as:	arXiv:2310.03018 [eess.AS]
	(or arXiv:2310.03018v3 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2310.03018

Submission history

From: Kuan-Po Huang
[v1] Wed, 4 Oct 2023 17:58:11 UTC (2,053 KB)
[v2] Sun, 17 Dec 2023 01:49:18 UTC (2,401 KB)
[v3] Mon, 18 Mar 2024 07:57:58 UTC (2,401 KB)
