On the Dichotomy Between Privacy and Traceability in $\ell_p$ Stochastic Convex Optimization

Voitovych, Sasha; Haghifam, Mahdi; Attias, Idan; Dziugaite, Gintare Karolina; Livni, Roi; Roy, Daniel M.

arXiv:2502.17384 (cs)

[Submitted on 24 Feb 2025]

Abstract: In this paper, we investigate the necessity of memorization in stochastic convex optimization (SCO) under $\ell_p$ geometries. Informally, we say a learning algorithm memorizes $m$ samples (or is $m$-traceable) if, by analyzing its output, it is possible to identify at least $m$ of its training samples. Our main results uncover a fundamental tradeoff between traceability and excess risk in SCO. For every $p\in [1,\infty)$, we establish the existence of a risk threshold below which any sample-efficient learner must memorize a \emph{constant fraction} of its sample. For $p\in [1,2]$, this threshold coincides with the best risk of differentially private (DP) algorithms, i.e., above this threshold, there are algorithms that do not memorize even a single sample. This establishes a sharp dichotomy between privacy and traceability for $p \in [1,2]$. For $p \in (2,\infty)$, this threshold instead gives novel lower bounds for DP learning, partially closing an open problem in this setup. En route to proving these results, we introduce a complexity notion we term the \emph{trace value} of a problem, which unifies privacy lower bounds and traceability results, and prove a sparse variant of the fingerprinting lemma.
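
To make the informal notion of traceability concrete, the following is a minimal sketch of a correlation-based tracer for a toy mean-estimation problem. It is not the construction used in the paper: the data distribution (uniform on $\{\pm 1\}^d$), the learner (the empirical mean), the inner-product score, the threshold $d/(2n)$, and all function names are assumptions chosen purely for illustration. The tracer flags a candidate point as a training sample when its inner product with the learner's output is unusually large.

```python
import random

def sample_point(rng, d):
    """Draw one point uniformly from {-1, +1}^d (an assumed toy distribution)."""
    return [rng.choice((-1, 1)) for _ in range(d)]

def empirical_mean(samples):
    """Toy non-private learner: coordinate-wise average of the training set."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[k] for x in samples) / n for k in range(d)]

def trace(theta, candidates, threshold):
    """Return indices of candidates whose inner product with theta exceeds threshold."""
    traced = []
    for i, z in enumerate(candidates):
        score = sum(t * zk for t, zk in zip(theta, z))
        if score > threshold:
            traced.append(i)
    return traced

rng = random.Random(0)
d, n = 2000, 20  # dimension and sample size (illustrative values)

train = [sample_point(rng, d) for _ in range(n)]  # points the learner sees
fresh = [sample_point(rng, d) for _ in range(n)]  # independent points it never sees

theta = empirical_mean(train)

# A training point contributes d/n to its own score; a fresh point scores about 0
# on average, so a threshold halfway between the two separates them in this toy setup.
threshold = d / (2 * n)

traced_train = trace(theta, train, threshold)
traced_fresh = trace(theta, fresh, threshold)
print("training points traced:", len(traced_train), "of", n)
print("fresh points falsely traced:", len(traced_fresh), "of", n)
```

In this toy setting the empirical mean is highly traceable because $d$ is large relative to $n$; a learner that adds enough noise to its output would shrink the gap between the two score distributions, which is the tension between privacy and traceability that the abstract describes.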

Comments:	53 Pages
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2502.17384 [cs.LG]
	(or arXiv:2502.17384v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.17384

Submission history

From: Mahdi Haghifam
[v1] Mon, 24 Feb 2025 18:10:06 UTC (187 KB)
