Adversarial Profiles: Detecting Out-Distribution & Adversarial Samples in Pre-trained CNNs

Rajabi, Arezoo; Bobba, Rakesh B.

arXiv:2011.09123 (cs)

[Submitted on 18 Nov 2020]

Abstract: Despite their high accuracy, Convolutional Neural Networks (CNNs) are vulnerable to adversarial and out-distribution examples. Many methods have been proposed to detect these fooling examples or to make CNNs robust against them. However, most such methods need access to a wide range of fooling examples to retrain the network or to tune detection parameters. Here, we propose a method to detect adversarial and out-distribution examples against a pre-trained CNN without retraining the CNN or requiring access to a wide variety of fooling examples. To this end, we create adversarial profiles for each class using only one adversarial attack generation technique. We then wrap a detector around the pre-trained CNN that applies the created adversarial profile to each input and uses the output to decide whether or not the input is legitimate. Our initial evaluation of this approach on the MNIST dataset shows that adversarial-profile-based detection is effective in detecting at least 92% of out-distribution examples and 59% of adversarial examples.
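
The wrapper idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy nearest-prototype classifier stands in for the pre-trained CNN, each class profile is modeled as a list of (perturbation, expected target class) pairs, and the match-rate scoring with a 0.5 threshold is a hypothetical decision rule. The names `toy_model`, `PROFILES`, `profile_match_rate`, and `is_legitimate` are all invented for this example.

```python
import numpy as np

# Hypothetical stand-in for a pre-trained CNN: a nearest-prototype
# classifier over 2-D inputs with three classes.
PROTOTYPES = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])


def toy_model(x):
    """Return one score per class (higher means more likely)."""
    return -np.linalg.norm(PROTOTYPES - x, axis=1)


# Assumed profile format: for each class, a list of
# (perturbation, expected_target_class) pairs. In this sketch the
# perturbations are hand-picked; they are not generated by an attack.
PROFILES = {
    0: [(np.array([3.0, 0.0]), 1), (np.array([0.0, 3.0]), 2)],
    1: [(np.array([-3.0, 0.0]), 0), (np.array([-4.0, 4.0]), 2)],
    2: [(np.array([0.0, -3.0]), 0), (np.array([4.0, -4.0]), 1)],
}


def predict(model, x):
    """Class index with the highest score."""
    return int(np.argmax(model(x)))


def profile_match_rate(model, profiles, x):
    """Apply the predicted class's profile to x and report the fraction
    of perturbations that move x to their expected target class."""
    predicted = predict(model, x)
    pairs = profiles[predicted]
    hits = sum(predict(model, x + delta) == target for delta, target in pairs)
    return predicted, hits / len(pairs)


def is_legitimate(model, profiles, x, threshold=0.5):
    """Hypothetical decision rule: accept x when enough of the profile
    perturbations behave as they would for a clean sample."""
    _, rate = profile_match_rate(model, profiles, x)
    return rate >= threshold


if __name__ == "__main__":
    clean = np.array([0.2, 0.1])     # close to the class-0 prototype
    far = np.array([-5.0, -5.0])     # far from every prototype
    print(is_legitimate(toy_model, PROFILES, clean))
    print(is_legitimate(toy_model, PROFILES, far))
```

In this toy setting, a point near a class prototype responds to the profile perturbations as expected, while a point far from all prototypes keeps its original label under the same perturbations and is rejected. The model itself is never retrained; only the wrapper consults the profiles.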

Comments:	Accepted at the DSN Workshop on Dependable and Secure Machine Learning 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2011.09123 [cs.CV]
	(or arXiv:2011.09123v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2011.09123
Journal reference:	DSN Workshop on Dependable and Secure Machine Learning (DSML 2019)

Submission history

From: Arezoo Rajabi
[v1] Wed, 18 Nov 2020 07:10:13 UTC (917 KB)