Searching for the Essence of Adversarial Perturbations

Menn, Dennis Y.; Feng, Tzu-hsun; Lee, Hung-yi

Computer Science > Machine Learning

arXiv:2205.15357 (cs)

[Submitted on 30 May 2022 (v1), last revised 3 Feb 2023 (this version, v3)]

Abstract: Neural networks have demonstrated state-of-the-art performance in various machine learning fields. However, inputs altered by malicious perturbations, known as adversarial examples, have been shown to deceive neural network predictions. This poses potential risks for real-world applications such as autonomous driving and text identification. To mitigate these risks, a comprehensive understanding of the mechanisms underlying adversarial examples is essential. In this study, we demonstrate that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's incorrect prediction, in contrast to the widely held belief that human-unidentifiable characteristics play a critical role in fooling a network. This concept of human-recognizable characteristics enables us to explain key features of adversarial perturbations, including their existence, transferability among different neural networks, and increased interpretability for adversarial training. We also uncover two unique properties of adversarial perturbations that deceive neural networks: masking and generation. Additionally, a special class, the complementary class, is identified when neural networks classify input images. The presence of human-recognizable information in adversarial perturbations allows researchers to gain insight into the working principles of neural networks and may lead to the development of techniques for detecting and defending against adversarial attacks.

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:2205.15357 [cs.LG]
	(or arXiv:2205.15357v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.15357

Submission history

From: Dennis Menn
[v1] Mon, 30 May 2022 18:04:57 UTC (1,228 KB)
[v2] Sat, 6 Aug 2022 02:34:36 UTC (1,336 KB)
[v3] Fri, 3 Feb 2023 10:38:51 UTC (3,742 KB)
