Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks

Grosse, Kathrin; Lee, Taesung; Biggio, Battista; Park, Youngja; Backes, Michael; Molloy, Ian


arXiv:2006.06721 (cs)

[Submitted on 11 Jun 2020 (v1), last revised 2 Nov 2021 (this version, v4)]


Abstract: Backdoor attacks mislead machine-learning models into outputting an attacker-specified class when presented with a specific trigger at test time. These attacks require poisoning the training data to compromise the learning algorithm, e.g., by injecting poisoning samples containing the trigger into the training set, along with the desired class label. Despite the increasing number of studies on backdoor attacks and defenses, the underlying factors affecting the success of backdoor attacks, along with their impact on the learning algorithm, are not yet well understood. In this work, we aim to shed light on this issue by showing that backdoor attacks induce a smoother decision function around the triggered samples, a phenomenon which we refer to as backdoor smoothing. To quantify backdoor smoothing, we define a measure that evaluates the uncertainty associated with the predictions of a classifier around the input samples. Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks. We also provide preliminary evidence that backdoor triggers are not the only smoothing-inducing patterns, and that other artificial patterns can also be detected by our approach, paving the way towards understanding the limitations of current defenses and designing novel ones.
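The idea of measuring prediction uncertainty around an input can be illustrated with a minimal sketch. This is not the measure defined in the paper, whose exact form the abstract does not give. As an assumption, it samples Gaussian noise around an input and computes the entropy of the predicted labels, so that lower values indicate a smoother, more uniform decision region. The names `local_prediction_entropy`, `toy_predict_proba`, `sigma`, and `n_samples` are hypothetical.

```python
import numpy as np


def local_prediction_entropy(predict_proba, x, sigma=0.1, n_samples=200, seed=0):
    """Entropy of predicted labels for Gaussian-perturbed copies of x.

    Assumption: an illustrative stand-in for a local uncertainty measure,
    not the definition used in the paper.
    """
    rng = np.random.default_rng(seed)
    # Draw n_samples perturbations with the same shape as x.
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    probs = predict_proba(x[None, ...] + noise)  # shape: (n_samples, n_classes)
    labels = probs.argmax(axis=1)
    # Empirical distribution of predicted labels in the neighbourhood of x.
    counts = np.bincount(labels, minlength=probs.shape[1])
    freq = counts / counts.sum()
    nonzero = freq[freq > 0]
    return abs(float(-(nonzero * np.log(nonzero)).sum()))


def toy_predict_proba(batch):
    """Hypothetical two-class softmax classifier with boundary at x[0] = 0."""
    logits = 5.0 * np.stack([batch[:, 0], -batch[:, 0]], axis=1)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


# A point far from the boundary versus a point lying on it.
h_far = local_prediction_entropy(toy_predict_proba, np.array([3.0, 0.0]))
h_near = local_prediction_entropy(toy_predict_proba, np.array([0.0, 0.0]))
```

In this toy setting, `h_far` is zero because every perturbed sample receives the same label, while `h_near` is close to log 2 because the labels split roughly evenly. Under the abstract's claim, one would expect a triggered input to a backdoored model to behave more like the first case.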

Comments:	9 pages, 7 figures, under submission
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
Cite as:	arXiv:2006.06721 [cs.LG]
	(or arXiv:2006.06721v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2006.06721

Submission history

From: Kathrin Grosse
[v1] Thu, 11 Jun 2020 18:28:54 UTC (3,202 KB)
[v2] Thu, 18 Jun 2020 07:31:59 UTC (3,202 KB)
[v3] Thu, 10 Jun 2021 14:58:34 UTC (4,568 KB)
[v4] Tue, 2 Nov 2021 11:24:11 UTC (8,765 KB)
