Adaptive PII Mitigation Framework for Large Language Models

Asthana, Shubhi; Mahindru, Ruchi; Zhang, Bing; Sanz, Jorge


arXiv:2501.12465 (cs)

[Submitted on 21 Jan 2025]

Abstract: Artificial Intelligence (AI) faces growing challenges from evolving data protection laws and enforcement practices worldwide. Regulations like GDPR and CCPA impose strict compliance requirements on Machine Learning (ML) models, especially concerning personal data use. These laws grant individuals rights such as data correction and deletion, complicating the training and deployment of Large Language Models (LLMs) that rely on extensive datasets. The public availability of data does not guarantee that it can lawfully be used for ML, which amplifies these challenges.
This paper introduces an adaptive system for mitigating the risk of exposing Personally Identifiable Information (PII) and Sensitive Personal Information (SPI) in LLMs. It dynamically aligns with diverse regulatory frameworks and integrates seamlessly into Governance, Risk, and Compliance (GRC) systems. The system uses advanced NLP techniques, context-aware analysis, and policy-driven masking to ensure regulatory compliance.
Benchmarks highlight the system's effectiveness, with an F1 score of 0.95 for Passport Numbers, outperforming tools like Microsoft Presidio (0.33) and Amazon Comprehend (0.54). In human evaluations, the system achieved an average user trust score of 4.6/5, with participants acknowledging its accuracy and transparency. Observations demonstrate stricter anonymization under GDPR compared to CCPA, which permits pseudonymization and user opt-outs. These results validate the system as a scalable and robust solution for enterprise privacy compliance.
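To make the idea of policy-driven masking concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical passport pattern (one uppercase letter followed by eight digits) and a hypothetical policy table that mirrors the abstract's observation: full redaction under GDPR, pseudonymization under CCPA. The names `PASSPORT_RE`, `POLICIES`, `pseudonym`, and `mask` are invented for illustration; the system described in the paper relies on context-aware NLP analysis rather than a single regular expression.

```python
import hashlib
import re

# Hypothetical pattern: one uppercase letter followed by eight digits.
# Real passport formats vary by country; this is for illustration only.
PASSPORT_RE = re.compile(r"\b[A-Z][0-9]{8}\b")

# Hypothetical policy table (an assumption, not taken from the paper).
POLICIES = {
    "GDPR": "anonymize",     # replace the value with a generic tag
    "CCPA": "pseudonymize",  # replace the value with a stable token
}


def pseudonym(value: str) -> str:
    """Return a stable token derived from the value.

    An unsalted hash is used only to keep the sketch short; a real
    deployment would need a keyed or salted scheme.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    return f"<PASSPORT_{digest}>"


def mask(text: str, regulation: str) -> str:
    """Mask passport-like strings according to the selected policy."""
    action = POLICIES[regulation]
    if action == "anonymize":
        return PASSPORT_RE.sub("<PASSPORT>", text)
    return PASSPORT_RE.sub(lambda m: pseudonym(m.group(0)), text)


sample = "Traveler passport X12345678 was checked twice: X12345678."
gdpr_out = mask(sample, "GDPR")
ccpa_out = mask(sample, "CCPA")
print(gdpr_out)
print(ccpa_out)
```

Under the anonymizing policy both occurrences collapse to the same generic tag, so they can no longer be linked to an individual. Under the pseudonymizing policy both occurrences map to the same token, which preserves the link between records without showing the raw value.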

Comments:	This paper has been accepted at PPAI-25, the 6th AAAI Workshop on Privacy-Preserving Artificial Intelligence
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Cite as:	arXiv:2501.12465 [cs.LG]
	(or arXiv:2501.12465v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2501.12465

Submission history

From: Shubhi Asthana
[v1] Tue, 21 Jan 2025 19:22:45 UTC (2,558 KB)
