On the Impact of Voice Anonymization on Speech-Based COVID-19 Detection

Zhu, Yi; Imoussaïne-Aïkous, Mohamed; Côté-Lussier, Carolyn; Falk, Tiago H.

arXiv:2304.02181v1 (cs)

COVID-19 e-print

Important: e-prints posted on arXiv are not peer-reviewed by arXiv; they should not be relied upon without context to guide clinical practice or health-related behavior and should not be reported in news media as established information without consulting multiple experts in the field.

[Submitted on 5 Apr 2023 (this version), latest version 26 Jun 2024 (v2)]

Abstract: With advances in deep learning, voice-based applications are burgeoning, ranging from personal assistants and affective computing to remote disease diagnostics. As the voice contains both linguistic and paralinguistic information (e.g., vocal pitch, intonation, speech rate, loudness), there is growing interest in voice anonymization to protect speaker privacy and identity. Voice privacy challenges have emerged over the last few years, and the focus has been on removing speaker identity while keeping linguistic content intact. For affective computing and disease monitoring applications, however, the paralinguistic content may be more critical. Unfortunately, the effects that anonymization may have on these systems are still largely unknown. In this paper, we fill this gap and focus on one particular health monitoring application: speech-based COVID-19 diagnosis. We test two popular anonymization methods and their impact on five different state-of-the-art COVID-19 diagnostic systems using three public datasets. We validate the effectiveness of the anonymization methods, compare their computational complexity, and quantify the impact across different testing scenarios for both within- and across-dataset conditions. Lastly, we show the benefits of anonymization as a data augmentation tool to help recover some of the COVID-19 diagnostic accuracy loss seen with anonymized data.

Comments:	11 pages, 10 figures
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2304.02181 [cs.CL]
	(or arXiv:2304.02181v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2304.02181

Submission history

From: Yi Zhu
[v1] Wed, 5 Apr 2023 01:09:58 UTC (6,755 KB)
[v2] Wed, 26 Jun 2024 17:58:42 UTC (41,907 KB)
