Zhao Jiawei, Xie Lizhe, Gu Siqi, Qin Zihan, Zhang Yuning, Wang Zheng, Hu Yining
Southeast University, School of Cyber Science and Engineering, Nanjing, 210096, China.
Nanjing Medical University, Engineering Centre for Digital Medical Technology of Stomatology, Nanjing, 210029, China.
Sci Rep. 2025 Feb 12;15(1):5237. doi: 10.1038/s41598-025-89267-8.
Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, significantly hindering the adoption of deep learning in high-security domains. A key challenge is that current defense methods often lack universality: they are effective only against certain types of adversarial attacks. This study addresses this challenge by analyzing adversarial examples through the changes they induce in model attention, classifying attack algorithms into attention-shifting and attention-attenuation categories. Our main novelty lies in proposing two defense modules: the Feature Pyramid-based Attention Space-guided (FPAS) module to counter attention-shifting attacks, and the Attention-based Non-Local (ANL) module to mitigate attention-attenuation attacks. These modules enhance the model's defense capability with minimal intrusion into the original model. By integrating FPAS and ANL into the Wide-ResNet model within a boosting framework, we demonstrate their synergistic defense capability. Even when adversarial examples contain embedded patches, our models show significant improvements over the baseline, raising the average defense rate by 5.47% and 7.74%, respectively. Extensive experiments confirm that this universal defense strategy offers comprehensive protection against adversarial attacks at a lower implementation cost than current mainstream defense methods, and that it can be integrated with existing defense strategies to further enhance adversarial robustness.
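The attack taxonomy above (attention-shifting vs. attention-attenuation) can be illustrated with a minimal sketch. Assuming Grad-CAM-style 2D attention maps for the clean and adversarial inputs, one could compare the displacement of the attention centroid (shift) against the drop in peak attention mass (attenuation). The function names and thresholds below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def attention_centroid(att):
    """Return (row, col) of the attention mass centre of a 2D map."""
    att = att / (att.sum() + 1e-12)          # normalize to a distribution
    ys, xs = np.indices(att.shape)
    return np.array([(ys * att).sum(), (xs * att).sum()])

def characterize_attack(att_clean, att_adv, shift_thresh=2.0, atten_thresh=0.25):
    """Heuristically label an adversarial example by comparing attention maps.

    shift_thresh  -- centroid displacement (in pixels) above which the
                     attack is treated as attention-shifting
    atten_thresh  -- relative drop in peak attention above which the
                     attack is treated as attention-attenuation
    """
    shift = np.linalg.norm(
        attention_centroid(att_adv) - attention_centroid(att_clean)
    )
    attenuation = 1.0 - att_adv.max() / (att_clean.max() + 1e-12)
    if shift > shift_thresh:
        return "attention-shifting"
    if attenuation > atten_thresh:
        return "attention-attenuation"
    return "benign-like"

# Toy example: the adversarial map's peak moves from (4, 4) to (10, 10).
clean = np.zeros((16, 16)); clean[4, 4] = 1.0
adv = np.zeros((16, 16)); adv[10, 10] = 1.0
print(characterize_attack(clean, adv))       # attention-shifting
```

A real pipeline would obtain the attention maps from an attention-extraction method such as Grad-CAM on the defended network; the thresholds would need tuning per architecture and dataset.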