Agarwal Akshay, Goswami Gaurav, Vatsa Mayank, Singh Richa, Ratha Nalini K
IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):3277-3289. doi: 10.1109/TNNLS.2021.3051529. Epub 2022 Aug 3.
Adversarial perturbations have exposed the vulnerability of deep learning algorithms to adversarial attacks. Existing adversary detection algorithms attempt to detect the singularities; however, they are, in general, dependent on the loss function, database, or model. To mitigate this limitation, we propose DAMAD, a generalized perturbation detection algorithm that is agnostic to the model architecture, the training data set, and the loss function used during training. The proposed algorithm is based on the fusion of autoencoder embedding and statistical texture features extracted from convolutional neural networks. The performance of DAMAD is evaluated in the challenging scenarios of cross-database, cross-attack, and cross-architecture training and testing, along with the traditional evaluation of testing on the same database with a known attack and model. Comparison with state-of-the-art perturbation detection algorithms showcases the effectiveness of the proposed algorithm on six databases: ImageNet, CIFAR-10, Multi-PIE, MEDS, Point and Shoot Challenge (PaSC), and MNIST. The evaluation covers nearly a quarter of a million adversarial and original images.
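The feature fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which texture statistics are used or how the features are combined, so the sketch assumes simple first-order statistics (mean, standard deviation, range) and plain concatenation. The names `texture_statistics` and `fuse_features` and all input values are hypothetical.

```python
from statistics import fmean, pstdev


def texture_statistics(feature_map):
    """Summarize one 2-D feature map with first-order statistics.

    Assumption: mean, standard deviation, and range stand in for the
    "statistical texture features" named in the abstract.
    """
    values = [v for row in feature_map for v in row]
    return [fmean(values), pstdev(values), max(values) - min(values)]


def fuse_features(embedding, feature_maps):
    """Concatenate an embedding vector with per-map texture statistics.

    Assumption: fusion is plain concatenation; the abstract does not
    say how the two feature types are combined.
    """
    fused = list(embedding)
    for fmap in feature_maps:
        fused.extend(texture_statistics(fmap))
    return fused


# Toy inputs: a hypothetical 2-D autoencoder embedding and two 2x2 CNN feature maps.
embedding = [0.5, -1.0]
feature_maps = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
fused = fuse_features(embedding, feature_maps)
```

In a detector of this kind, a fused vector such as `fused` would then be passed to a binary classifier that separates original from adversarial images; the choice of classifier is likewise not stated in the abstract.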