High Accuracy and High Fidelity Extraction of Neural Networks

Jagielski, Matthew; Carlini, Nicholas; Berthelot, David; Kurakin, Alex; Papernot, Nicolas

arXiv:1909.01838 (cs)

[Submitted on 3 Sep 2019 (v1), last revised 3 Mar 2020 (this version, v2)]

Abstract: In a model extraction attack, an adversary steals a copy of a remotely deployed machine learning model, given oracle prediction access. We taxonomize model extraction attacks around two objectives: *accuracy*, i.e., performing well on the underlying learning task, and *fidelity*, i.e., matching the predictions of the remote victim classifier on any input.
To extract a high-accuracy model, we develop a learning-based attack exploiting the victim to supervise the training of an extracted model. Through analytical and empirical arguments, we then explain the inherent limitations that prevent any learning-based strategy from extracting a truly high-fidelity model, i.e., a functionally-equivalent model whose predictions are identical to those of the victim model on all possible inputs. Addressing these limitations, we expand on prior work to develop the first practical functionally-equivalent extraction attack for direct extraction (i.e., without training) of a model's weights.
We perform experiments both on academic datasets and a state-of-the-art image classifier trained with 1 billion proprietary images. In addition to broadening the scope of model extraction research, our work demonstrates the practicality of model extraction attacks against production-grade systems.
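
The two objectives named in the abstract can pull apart, and a small sketch may make the distinction concrete. The code below is only an illustration under stated assumptions: the functions `accuracy` and `fidelity`, the toy threshold classifiers `victim` and `extracted`, and the four data points are all hypothetical and do not come from the paper. Fidelity is measured here on a finite sample, which is weaker than the functional equivalence on all inputs that the abstract describes.

```python
from typing import Callable, List, Sequence

# A "model" here is any function mapping a feature vector to a class label.
Model = Callable[[Sequence[float]], int]


def accuracy(model: Model, inputs: List[Sequence[float]], labels: List[int]) -> float:
    """Fraction of inputs on which the model predicts the true label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(inputs)


def fidelity(extracted: Model, victim: Model, inputs: List[Sequence[float]]) -> float:
    """Fraction of inputs on which the extracted model agrees with the victim.

    The true labels are not used: fidelity measures agreement with the
    victim, including agreement on the victim's mistakes.
    """
    agree = sum(1 for x in inputs if extracted(x) == victim(x))
    return agree / len(inputs)


# Hypothetical toy classifiers: one-dimensional threshold rules.
def victim(x: Sequence[float]) -> int:
    return int(x[0] > 0.5)


def extracted(x: Sequence[float]) -> int:
    return int(x[0] > 0.4)


# Hypothetical evaluation set; the second point is one the victim gets wrong.
inputs = [[0.1], [0.45], [0.6], [0.9]]
labels = [0, 1, 1, 1]

victim_accuracy = accuracy(victim, inputs, labels)
extracted_accuracy = accuracy(extracted, inputs, labels)
extracted_fidelity = fidelity(extracted, victim, inputs)

print(victim_accuracy, extracted_accuracy, extracted_fidelity)
```

In this toy setup the extracted model is more accurate than the victim yet disagrees with it on one of the four points, so an adversary optimizing for accuracy and one optimizing for fidelity would rank the same extracted model differently.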

Comments:	USENIX Security 2020, 18 pages, 6 figures
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
Cite as:	arXiv:1909.01838 [cs.LG]
	(or arXiv:1909.01838v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1909.01838

Submission history

From: Matthew Jagielski
[v1] Tue, 3 Sep 2019 17:33:09 UTC (636 KB)
[v2] Tue, 3 Mar 2020 22:08:02 UTC (323 KB)
