Protecting Privacy in Classifiers by Token Manipulation

Harel, Re'em; Elboher, Yair; Pinter, Yuval

arXiv:2407.01334 (cs)

[Submitted on 1 Jul 2024 (v1), last revised 3 Jul 2024 (this version, v2)]

Abstract: Using language models as a remote service entails sending private information to an untrusted provider. In addition, potential eavesdroppers can intercept the messages, thereby exposing the information. In this work, we explore the prospects of avoiding such data exposure at the level of text manipulation. We focus on text classification models, examining various token mapping and contextualized manipulation functions in order to see whether classifier accuracy may be maintained while keeping the original text unrecoverable. We find that although some token mapping functions are easy and straightforward to implement, they heavily influence performance on the downstream task and can be reconstructed by a sophisticated attacker. In comparison, the contextualized manipulation provides an improvement in performance.
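To make the idea of a token mapping function concrete, here is a minimal Python sketch. It is an illustration only and not necessarily any of the functions evaluated in the paper: it assumes the mapping is a secret random permutation of token IDs, and the names `make_token_mapping`, `apply_mapping`, and `invert_mapping`, the vocabulary size, and the seed are all hypothetical.

```python
import random
from collections import Counter


def make_token_mapping(vocab_size, seed):
    """Build a one-to-one mapping over token IDs as a seeded random permutation.

    Assumption for illustration: the seed plays the role of a client-side secret.
    """
    rng = random.Random(seed)
    permuted = list(range(vocab_size))
    rng.shuffle(permuted)
    return permuted


def apply_mapping(token_ids, mapping):
    """Replace each token ID with its image under the mapping."""
    return [mapping[t] for t in token_ids]


def invert_mapping(mapping):
    """Compute the inverse permutation, which only the key holder can build directly."""
    inverse = [0] * len(mapping)
    for src, dst in enumerate(mapping):
        inverse[dst] = src
    return inverse


# Toy example with a hypothetical 10-token vocabulary.
mapping = make_token_mapping(vocab_size=10, seed=0)
original = [3, 1, 3, 7, 3, 1]
obfuscated = apply_mapping(original, mapping)
recovered = apply_mapping(obfuscated, invert_mapping(mapping))

# A one-to-one mapping leaves the token frequency profile unchanged.
original_profile = sorted(Counter(original).values())
obfuscated_profile = sorted(Counter(obfuscated).values())
```

In this sketch the obfuscated sequence has the same frequency profile as the original, which is one plausible reason a simple one-to-one mapping could be open to reconstruction; the abstract does not say which attack the authors actually use.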

Comments:	PrivateNLP@ACL 2024
Subjects:	Computation and Language (cs.CL); Cryptography and Security (cs.CR)
Cite as:	arXiv:2407.01334 [cs.CL]
	(or arXiv:2407.01334v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.01334

Submission history

From: Re'em Harel
[v1] Mon, 1 Jul 2024 14:41:59 UTC (7,668 KB)
[v2] Wed, 3 Jul 2024 16:31:52 UTC (7,668 KB)
