Algorithms that Approximate Data Removal: New Results and Limitations

Suriyakumar, Vinith M.; Wilson, Ashia C.


arXiv:2209.12269 (stat)

[Submitted on 25 Sep 2022]



Abstract: We study the problem of deleting user data from machine learning models trained using empirical risk minimization. Our focus is on learning algorithms that return the empirical risk minimizer and on approximate unlearning algorithms that comply with deletion requests arriving in streaming minibatches. Leveraging the infinitesimal jackknife, we develop an online unlearning algorithm that is both computationally and memory efficient. Unlike prior memory-efficient unlearning algorithms, we target models that minimize objectives with non-smooth regularizers, such as the commonly used $\ell_1$, elastic net, or nuclear norm penalties. We also provide generalization, deletion capacity, and unlearning guarantees that are consistent with state-of-the-art methods. Across a variety of benchmark datasets, our algorithm empirically improves upon the runtime of prior methods while maintaining the same memory requirements and test accuracy. Finally, we open a new direction of inquiry by proving that all approximate unlearning algorithms introduced so far fail to unlearn in problem settings where common hyperparameter tuning methods, such as cross-validation, have been used to select models.
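As a rough illustration of the infinitesimal jackknife idea mentioned in the abstract, the sketch below approximates retraining after a deletion with a single Hessian-based correction. It is not the authors' algorithm: it assumes a smooth ridge-regression objective rather than the non-smooth regularizers the paper targets, handles one deletion batch rather than a stream, and adds none of the noise that unlearning guarantees would typically require. The names `fit_ridge` and `ij_remove` are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Exact minimizer of 0.5*||X @ theta - y||^2 + 0.5*lam*||theta||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ij_remove(theta, X, y, lam, remove_idx):
    """Approximate the retrained minimizer after deleting rows remove_idx.

    Uses the Hessian of the FULL objective (the infinitesimal jackknife
    approximation), so nothing is refit on the remaining data.
    """
    d = X.shape[1]
    hessian = X.T @ X + lam * np.eye(d)
    X_u, y_u = X[remove_idx], y[remove_idx]
    # Sum of the per-example loss gradients of the deleted points at theta.
    grad_removed = X_u.T @ (X_u @ theta - y_u)
    return theta + np.linalg.solve(hessian, grad_removed)

# Synthetic data (assumed purely for illustration).
rng = np.random.default_rng(0)
n, d, lam = 500, 5, 1.0
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

remove_idx = np.arange(5)                      # one minibatch of deletions
keep = np.setdiff1d(np.arange(n), remove_idx)

theta_full = fit_ridge(X, y, lam)
theta_retrained = fit_ridge(X[keep], y[keep], lam)   # ground truth
theta_ij = ij_remove(theta_full, X, y, lam, remove_idx)

err_no_update = np.linalg.norm(theta_full - theta_retrained)
err_ij = np.linalg.norm(theta_ij - theta_retrained)
print(f"no update: {err_no_update:.2e}, jackknife update: {err_ij:.2e}")
```

In this quadratic setting, replacing the full Hessian with the Hessian of the remaining data would recover the retrained solution exactly. The jackknife variant trades that exactness for a Hessian that does not depend on which points are deleted.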

Comments:	Accepted to NeurIPS 2022
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2209.12269 [stat.ML]
	(or arXiv:2209.12269v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2209.12269

Submission history

From: Vinith Suriyakumar
[v1] Sun, 25 Sep 2022 17:20:33 UTC (7,899 KB)
