Chen Yuling, Yang Hao, Wang Xuewei, Wang Qi, Zhou Huiyu
The State Key Laboratory of Public Big Data and College of Computer Science and Technology, University of Guizhou, Guiyang 550025, China.
Computer College, Weifang University of Science and Technology, Weifang 261000, China.
Entropy (Basel). 2023 Mar 6;25(3):461. doi: 10.3390/e25030461.
Adversarial attacks are crucial to improving the robustness of deep learning models; they help improve the interpretability of deep learning and also increase the security of models in real-world applications. However, existing attack algorithms mainly focus on image classification tasks, and little research targets object detection. Adversarial attacks against image classification are global: they pay no attention to the intrinsic features of the image. In other words, they generate perturbations that cover the whole image, and each added perturbation is equal in magnitude and undifferentiated. In contrast, we propose a global-to-local adversarial attack based on object detection, which destroys important perceptual features of the object. More specifically, we extract gradient features differentially and use them as the proportions for adding perturbations when generating adversarial samples, since the magnitude of the gradient is highly correlated with the model's points of interest. In addition, we reduce unnecessary perturbations by dynamically suppressing excessive perturbations, yielding high-quality adversarial samples. After that, we improve the effectiveness of the attack by using the high-frequency feature gradient as momentum to guide the next gradient step. Numerous experiments and evaluations have demonstrated the effectiveness and superior performance of our global-to-local gradient attack with high-frequency momentum guidance (GLH), which is more effective than previous attacks. The adversarial samples we generate also have excellent black-box attack ability.
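To make the global-to-local idea concrete, the following is a minimal sketch (not the authors' released code) of an iterative gradient attack that concentrates perturbations on the pixels with the largest gradient magnitude and accumulates momentum across steps; all names (model, loss_fn, epsilon, keep_ratio, etc.) and the specific masking/suppression choices are illustrative assumptions, not the exact GLH algorithm.

```python
# Sketch of a gradient-magnitude-masked, momentum-guided iterative attack.
# Assumes a differentiable PyTorch model and loss; inputs are normalized to [0, 1].
import torch

def glh_style_attack(model, loss_fn, image, target, epsilon=8 / 255,
                     steps=10, decay=1.0, keep_ratio=0.3):
    """keep_ratio selects the fraction of pixels with the largest gradient
    magnitude, standing in for the paper's 'local' perceptual regions."""
    adv = image.clone().detach()
    momentum = torch.zeros_like(image)
    alpha = epsilon / steps  # per-step budget
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = loss_fn(model(adv), target)
        grad = torch.autograd.grad(loss, adv)[0]

        # Local mask: keep only the top-k gradient magnitudes (differentiated perturbation).
        k = max(1, int(keep_ratio * grad.numel()))
        thresh = grad.abs().flatten().topk(k).values.min()
        mask = (grad.abs() >= thresh).float()

        # Momentum accumulation (MI-FGSM-style) to guide the next gradient step.
        momentum = decay * momentum + grad / grad.abs().mean().clamp_min(1e-12)

        # Apply the masked, signed update, then suppress excess perturbation
        # by projecting back into the epsilon ball and the valid pixel range.
        adv = adv.detach() + alpha * mask * momentum.sign()
        adv = image + (adv - image).clamp(-epsilon, epsilon)
        adv = adv.clamp(0, 1)
    return adv.detach()
```

A usage note: for object detection, loss_fn would be the detector's combined classification/regression loss rather than a single cross-entropy term; the masking step is what distinguishes this local, feature-focused perturbation from a uniform global attack.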