A Customized Text Sanitization Mechanism with Differential Privacy

Chen, Huimin; Mo, Fengran; Wang, Yanhao; Chen, Cen; Nie, Jian-Yun; Wang, Chengyu; Cui, Jamie

doi:10.18653/v1/2023.findings-acl.355

Computer Science > Cryptography and Security

arXiv:2207.01193 (cs)

[Submitted on 4 Jul 2022 (v1), last revised 23 May 2023 (this version, v2)]

Title: A Customized Text Sanitization Mechanism with Differential Privacy

Authors: Huimin Chen, Fengran Mo, Yanhao Wang, Cen Chen, Jian-Yun Nie, Chengyu Wang, Jamie Cui


Abstract: As privacy issues are receiving increasing attention within the Natural Language Processing (NLP) community, numerous methods have been proposed to sanitize texts subject to differential privacy. However, the state-of-the-art text sanitization mechanisms based on metric local differential privacy (MLDP) do not apply to non-metric semantic similarity measures and cannot achieve good trade-offs between privacy and utility. To address the above limitations, we propose a novel Customized Text (CusText) sanitization mechanism based on the original $\epsilon$-differential privacy (DP) definition, which is compatible with any similarity measure. Furthermore, CusText assigns each input token a customized output set of tokens to provide more advanced privacy protection at the token level. Extensive experiments on several benchmark datasets show that CusText achieves a better trade-off between privacy and utility than existing mechanisms. The code is available at this https URL.
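To make the idea of sampling a replacement token from a customized output set under $\epsilon$-DP concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the replacement is sampled, so the exponential mechanism is used here as an assumption; it is a standard way to obtain $\epsilon$-DP over a finite output set with an arbitrary (possibly non-metric) score. The names `exponential_mechanism_probs`, `sanitize_token`, `toy_similarity`, and the similarity table are all hypothetical. The guarantee sketched here applies to input tokens that share the same output set, with scores clipped to [0, 1].

```python
import math
import random

def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Return probabilities proportional to exp(epsilon * score / (2 * sensitivity)).

    `scores` maps each candidate output token to a utility score.
    Subtracting the maximum score only improves numerical stability;
    it does not change the normalized probabilities.
    """
    top = max(scores.values())
    weights = {
        tok: math.exp(epsilon * (s - top) / (2.0 * sensitivity))
        for tok, s in scores.items()
    }
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def sanitize_token(token, output_set, similarity, epsilon, rng=random):
    """Sample a replacement for `token` from its output set.

    `similarity` may be any function; its values are clipped to [0, 1]
    so that the score sensitivity is at most 1 (an assumption of this sketch).
    """
    scores = {
        cand: min(1.0, max(0.0, similarity(token, cand)))
        for cand in output_set
    }
    probs = exponential_mechanism_probs(scores, epsilon)
    candidates = list(probs)
    return rng.choices(candidates, weights=[probs[c] for c in candidates], k=1)[0]

# Hypothetical, deliberately asymmetric (non-metric) similarity table.
TOY_SIM = {
    ("paris", "paris"): 1.0, ("paris", "london"): 0.7, ("paris", "berlin"): 0.6,
    ("london", "paris"): 0.5, ("london", "london"): 1.0, ("london", "berlin"): 0.8,
    ("berlin", "paris"): 0.4, ("berlin", "london"): 0.9, ("berlin", "berlin"): 1.0,
}

def toy_similarity(a, b):
    """Look up the toy similarity; unknown pairs score 0."""
    return TOY_SIM.get((a, b), 0.0)

if __name__ == "__main__":
    shared_output_set = ["paris", "london", "berlin"]
    print(sanitize_token("paris", shared_output_set, toy_similarity, epsilon=1.0))
```

In this sketch, a smaller `epsilon` flattens the sampling distribution (more privacy, less utility), while a larger `epsilon` concentrates it on the most similar candidates. Because the score is never required to satisfy the triangle inequality or symmetry, the same code works with a non-metric similarity measure.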

Comments:	This work has been accepted to the Findings of ACL 2023
Subjects:	Cryptography and Security (cs.CR); Computation and Language (cs.CL)
Cite as:	arXiv:2207.01193 [cs.CR]
	(or arXiv:2207.01193v2 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2207.01193
Journal reference:	https://aclanthology.org/2023.findings-acl.355/
Related DOI:	https://doi.org/10.18653/v1/2023.findings-acl.355

Submission history

From: Huimin Chen
[v1] Mon, 4 Jul 2022 04:37:42 UTC (776 KB)
[v2] Tue, 23 May 2023 04:52:19 UTC (178 KB)
