Computer Science > Computer Vision and Pattern Recognition

Title: Trace and Detect Adversarial Attacks on CNNs using Feature Response Maps

Abstract: The existence of adversarial attacks on convolutional neural networks (CNNs) calls into question the fitness of such models for serious applications. The attacks manipulate an input image so that the network misclassifies it while it still looks normal to a human observer -- they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers -- "feature responses" to a given input -- have been used to visualize for a human "debugger" what the CNN "looks at" while computing its output. In this work, we propose a novel method for detecting adversarial examples and thereby preventing attacks. We do so by tracking adversarial perturbations in the feature responses, which allows automatic detection using the average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet.
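The detection statistic named in the abstract, average local spatial entropy of a feature response map, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the patch size, the normalization of patch activations into probability distributions, and the thresholding idea in the usage comment are illustrative assumptions.

import numpy as np

def local_spatial_entropy(patch, eps=1e-12):
    # Treat the normalized absolute activations of one patch as a
    # discrete probability distribution and compute its Shannon entropy.
    p = np.abs(patch).ravel()
    p = p / (p.sum() + eps)
    return -np.sum(p * np.log2(p + eps))

def average_local_spatial_entropy(response_map, patch_size=8):
    # Average the local spatial entropy over non-overlapping patches of a
    # 2-D feature response map (e.g. backpropagated hidden-layer activations).
    # Intuition from the paper: adversarial perturbations scatter the
    # response, raising the local entropy relative to clean inputs.
    h, w = response_map.shape
    entropies = []
    for i in range(0, h - patch_size + 1, patch_size):
        for j in range(0, w - patch_size + 1, patch_size):
            patch = response_map[i:i + patch_size, j:j + patch_size]
            entropies.append(local_spatial_entropy(patch))
    return float(np.mean(entropies))

# Hypothetical usage: fit a threshold on clean data, then flag inputs
# whose feature responses are noticeably more diffuse.
rng = np.random.default_rng(0)
focused = np.zeros((64, 64))
focused[24:40, 24:40] = 1.0      # concentrated response ("clean-like")
diffuse = rng.random((64, 64))   # scattered response ("adversarial-like")
print(average_local_spatial_entropy(focused))   # low average entropy
print(average_local_spatial_entropy(diffuse))   # high average entropy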
Comments: 13 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Journal reference: 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR 2018)
DOI: 10.21256/zhaw-3863
Report number: zhaw-3863
Cite as: arXiv:2208.11436 [cs.CV]
  (or arXiv:2208.11436v1 [cs.CV] for this version)

Submission history

From: Mohammadreza Amirian
[v1] Wed, 24 Aug 2022 11:05:04 GMT (8216kb,D)
