Fundamental Limits of Perfect Concept Erasure

Chowdhury, Somnath Basu Roy; Dubey, Avinava; Beirami, Ahmad; Kidambi, Rahul; Monath, Nicholas; Ahmed, Amr; Chaturvedi, Snigdha

arXiv:2503.20098 (cs)

[Submitted on 25 Mar 2025]

Abstract: Concept erasure is the task of erasing information about a concept (e.g., gender or race) from a representation set while retaining the maximum possible utility, that is, information from the original representations. Concept erasure is useful in several applications, such as removing sensitive concepts to achieve fairness and interpreting the impact of specific concepts on a model's performance. Previous concept erasure techniques have prioritized robustly erasing concepts over retaining the utility of the resultant representations. However, there seems to be an inherent tradeoff between erasure and retaining utility, making it unclear how to achieve perfect concept erasure while maintaining high utility. In this paper, we offer a fresh perspective on this problem by quantifying the fundamental limits of concept erasure through an information-theoretic lens. Using these results, we investigate the constraints on the data distribution and the erasure functions required to achieve the limits of perfect concept erasure. Empirically, we show that the derived erasure functions achieve the optimal theoretical bounds. Additionally, we show that our approach outperforms existing methods on a range of synthetic and real-world datasets using GPT-4 representations.
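As a rough illustration of the tradeoff the abstract describes, the sketch below measures erasure and utility with mutual information on a toy discrete dataset. It is not the paper's method or its derived erasure functions. The toy distribution, the helper `mutual_information`, and the maps `identity` and `erase` are all assumptions made for this example, as is the choice to score leakage as I(Z; A) and utility as I(Z; X), where X is the original representation, A the concept, and Z the erased representation.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information in bits, estimated from equally weighted (u, v) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    p_u = Counter(u for u, _ in pairs)
    p_v = Counter(v for _, v in pairs)
    mi = 0.0
    for (u, v), count in joint.items():
        # p(u, v) / (p(u) * p(v)) simplifies to count * n / (count_u * count_v)
        mi += (count / n) * math.log2(count * n / (p_u[u] * p_v[v]))
    return mi


# Assumed toy data: x = (a, b), where the concept a and the residual b
# are independent fair bits, so all four values of x are equally likely.
X = [(a, b) for a in (0, 1) for b in (0, 1)]


def identity(x):
    # No erasure: the representation is returned unchanged.
    return x


def erase(x):
    # Hypothetical erasure function: drop the concept coordinate.
    return x[1]


results = {}
for name, f in [("identity", identity), ("erase", erase)]:
    leak = mutual_information([(f(x), x[0]) for x in X])  # I(Z; A)
    utility = mutual_information([(f(x), x) for x in X])  # I(Z; X)
    results[name] = (leak, utility)
    print(f"{name}: leakage={leak:.1f} bits, utility={utility:.1f} bits")
```

In this toy setting the identity map retains 2 bits of utility but leaks the full 1 bit of concept information, while dropping the concept coordinate leaks 0 bits and retains 1 bit. Real representations rarely separate this cleanly, which is where the tradeoff discussed in the abstract becomes nontrivial.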

Comments:	Accepted at AISTATS 2025
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2503.20098 [cs.LG]
	(or arXiv:2503.20098v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2503.20098

Submission history

From: Somnath Basu Roy Chowdhury
[v1] Tue, 25 Mar 2025 22:36:10 UTC (2,236 KB)
