Differentially Private Speaker Anonymization

Shamsabadi, Ali Shahin; Srivastava, Brij Mohan Lal; Bellet, Aurélien; Vauquier, Nathalie; Vincent, Emmanuel; Maouche, Mohamed; Tommasi, Marc; Papernot, Nicolas

arXiv:2202.11823v2 (cs)

[Submitted on 23 Feb 2022 (v1), last revised 6 Oct 2022 (this version, v2)]

Abstract: Sharing real-world speech utterances is key to the training and deployment of voice-based services. However, it also raises privacy risks as speech contains a wealth of personal data. Speaker anonymization aims to remove speaker information from a speech utterance while leaving its linguistic and prosodic attributes intact. State-of-the-art techniques operate by disentangling the speaker information (represented via a speaker embedding) from these attributes and re-synthesizing speech based on the speaker embedding of another speaker. Prior research in the privacy community has shown that anonymization often provides brittle privacy protection, let alone any provable guarantee. In this work, we show that disentanglement is indeed not perfect: linguistic and prosodic attributes still contain speaker information. We remove speaker information from these attributes by introducing differentially private feature extractors based on an autoencoder and an automatic speech recognizer, respectively, trained using noise layers. We plug these extractors into the state-of-the-art anonymization pipeline and generate, for the first time, private speech utterances with a provable upper bound on the speaker information they contain. We empirically evaluate the privacy and utility resulting from our differentially private speaker anonymization approach on the LibriSpeech data set. Experimental results show that the generated utterances retain very high utility for automatic speech recognition training and inference, while being much better protected against strong adversaries who leverage the full knowledge of the anonymization process to try to infer the speaker identity.

Subjects:	Sound (cs.SD); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2202.11823 [cs.SD]
	(or arXiv:2202.11823v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2202.11823

Submission history

From: Ali Shahin Shamsabadi
[v1] Wed, 23 Feb 2022 23:20:30 UTC (620 KB)
[v2] Thu, 6 Oct 2022 09:16:42 UTC (198 KB)
