Computer Science > Machine Learning
[Submitted on 30 May 2018 (v1), last revised 6 Jun 2019 (this version, v2)]
Title:Robustifying Models Against Adversarial Attacks by Langevin Dynamics
View PDFAbstract:Adversarial attacks on deep learning models have compromised their performance considerably. As remedies, a lot of defense methods were proposed, which however, have been circumvented by newer attacking strategies. In the midst of this ensuing arms race, the problem of robustness against adversarial attacks still remains unsolved. This paper proposes a novel, simple yet effective defense strategy where adversarial samples are relaxed onto the underlying manifold of the (unknown) target class distribution. Specifically, our algorithm drives off-manifold adversarial samples towards high density regions of the data generating distribution of the target class by the Metroplis-adjusted Langevin algorithm (MALA) with perceptual boundary taken into account. Although the motivation is similar to projection methods, e.g., Defense-GAN, our algorithm, called MALA for DEfense (MALADE), is equipped with significant dispersion - projection is distributed broadly, and therefore any whitebox attack cannot accurately align the input so that the MALADE moves it to a targeted untrained spot where the model predicts a wrong label. In our experiments, MALADE exhibited state-of-the-art performance against various elaborate attacking strategies.
Submission history
From: Vignesh Srinivasan [view email][v1] Wed, 30 May 2018 15:01:38 UTC (895 KB)
[v2] Thu, 6 Jun 2019 15:25:03 UTC (8,459 KB)
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
IArxiv Recommender
(What is IArxiv?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.