Speech Foundation Models and Crowdsourcing for Efficient, High-Quality Data Collection

Lee, Beomseok; Gaido, Marco; Calapodescu, Ioan; Besacier, Laurent; Negri, Matteo

arXiv:2412.11978 (cs)

[Submitted on 16 Dec 2024]

Abstract: While crowdsourcing is an established solution for facilitating and scaling the collection of speech data, the involvement of non-experts necessitates protocols to ensure final data quality. To reduce the costs of these essential controls, this paper investigates the use of Speech Foundation Models (SFMs) to automate the validation process, examining for the first time the cost/quality trade-off in data acquisition. Experiments conducted on French, German, and Korean data demonstrate that SFM-based validation has the potential to reduce reliance on human validation, resulting in an estimated cost saving of over 40.0% without degrading final data quality. These findings open new opportunities for more efficient, cost-effective, and scalable speech data acquisition.

Comments:	Accepted at COLING 2025 main conference
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2412.11978 [cs.CL]
	(or arXiv:2412.11978v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.11978

Submission history

From: Beomseok Lee
[v1] Mon, 16 Dec 2024 16:59:22 UTC (1,395 KB)
