IEEE Trans Pattern Anal Mach Intell. 2022 Apr;44(4):2188-2197. doi: 10.1109/TPAMI.2020.3033291. Epub 2022 Mar 4.
Adversarial attacks on deep neural networks (DNNs) have been studied for several years. However, existing adversarial attacks achieve high success rates only when the victim DNN is well known, or when it can be estimated through structural similarity or massive queries. In this paper, we propose Attack on Attention (AoA), which targets attention, a semantic property commonly shared by DNNs. AoA enjoys a significant increase in transferability when the traditional cross-entropy loss is replaced with an attention loss. Since AoA alters only the loss function, it can easily be combined with other transferability-enhancement techniques to achieve state-of-the-art performance. We apply AoA to generate 50,000 adversarial samples from the ImageNet validation set that defeat many neural networks, and accordingly name the dataset DAmageNet. Thirteen well-trained DNNs are tested on DAmageNet, and all of them have an error rate above 85 percent. Even with defenses or adversarial training, most models retain an error rate above 70 percent on DAmageNet. DAmageNet is the first universal adversarial dataset. It can be downloaded freely and serves as a benchmark for robustness testing and adversarial training.
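To make the idea concrete, a minimal sketch of an attention-targeted attack is given below. It is not the authors' implementation: the attention map is approximated here by the channel energy of a ResNet-50's last convolutional block, the overlap-based attention loss and PGD-style update are simplifications, and all function names and hyperparameters are illustrative assumptions rather than the paper's AoA loss.

```
import torch
import torchvision.models as models

# Pretrained surrogate model used to craft transferable adversarial samples.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()

# Capture the last convolutional block's feature map with a forward hook.
feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update({"map": o}))

def attention_map(x):
    """Spatial attention proxy: normalized channel energy of layer4 features."""
    model(x)
    a = feats["map"].pow(2).sum(dim=1)                      # (N, H, W)
    return a / (a.flatten(1).sum(dim=1).view(-1, 1, 1) + 1e-8)

def aoa_like_attack(x, eps=8 / 255, step=1 / 255, iters=20):
    """PGD-style loop that pushes the attention map away from the clean one."""
    with torch.no_grad():
        ref = attention_map(x)                              # clean attention, fixed
    x_adv = x.clone()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        adv = attention_map(x_adv)
        # Attention loss: overlap between adversarial and clean attention.
        # Descending on it drives attention off the originally attended region.
        loss = (adv * ref).sum()
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv - step * grad.sign()              # step against attention
            x_adv = x + (x_adv - x).clamp(-eps, eps)        # L_inf budget
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Usage (illustrative): x is a (1, 3, 224, 224) tensor in [0, 1];
# ImageNet normalization is omitted for brevity.
# x_adv = aoa_like_attack(x)
```

Because the loss acts only on the attention map, the same loop can be wrapped with momentum, input diversity, or other transferability-enhancement techniques without further changes, which is the modularity the abstract refers to.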