IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):9041-9054. doi: 10.1109/TPAMI.2022.3231886. Epub 2023 Jun 5.
Adversarial patches are an important form of real-world adversarial attack and pose serious risks to the robustness of deep neural networks. Previous methods generate adversarial patches either by optimizing the perturbation values at a fixed pasting position or by manipulating the position while keeping the patch's content fixed. This shows that both position and perturbation matter to the attack. In this article, we therefore propose a novel method that simultaneously optimizes the position and the perturbation of an adversarial patch, and thus obtains a high attack success rate in the black-box setting. Technically, we treat the patch's position and the pre-designed hyper-parameters that determine the patch's perturbations as the variables, and use a reinforcement learning framework to solve for both simultaneously, based on rewards obtained from the target model with a small number of queries. Extensive experiments are conducted on the face recognition (FR) task, and results on four representative FR models show that our method significantly improves the attack success rate and query efficiency. In addition, experiments on a commercial FR service and in physical environments confirm its practical value. We also extend our method to the traffic sign recognition task to verify its generalization ability.
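The idea of jointly searching over position and perturbation hyper-parameters with query-based rewards can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a discrete grid of candidate positions, a discrete set of hyper-parameter choices, a plain REINFORCE update with a moving-average baseline, and a hypothetical `toy_target_reward` function standing in for a query to the black-box model. All names, sizes, and the reward itself are illustrative assumptions.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()


def toy_target_reward(pos_idx, hp_idx):
    """Hypothetical stand-in for one query to a black-box target model.

    Assumption for illustration only: the reward is highest when the patch
    sits at grid cell 5 and uses hyper-parameter choice 2.
    """
    return 0.5 * float(pos_idx == 5) + 0.5 * float(hp_idx == 2)


def optimize_patch(reward_fn, n_positions=9, n_hparams=4,
                   n_queries=1500, lr=0.2, seed=0):
    """Jointly learn a position policy and a hyper-parameter policy.

    Each iteration samples one (position, hyper-parameter) pair, spends one
    query on reward_fn, and applies a REINFORCE update to both policies.
    """
    rng = np.random.default_rng(seed)
    pos_logits = np.zeros(n_positions)
    hp_logits = np.zeros(n_hparams)
    baseline = 0.0  # moving average of rewards, used to reduce variance

    for _ in range(n_queries):
        p_pos, p_hp = softmax(pos_logits), softmax(hp_logits)
        pos = rng.choice(n_positions, p=p_pos)
        hp = rng.choice(n_hparams, p=p_hp)

        reward = reward_fn(pos, hp)  # one black-box query
        advantage = reward - baseline

        # Gradient of log softmax(logits)[k] is one_hot(k) - probabilities.
        grad_pos = -p_pos
        grad_pos[pos] += 1.0
        grad_hp = -p_hp
        grad_hp[hp] += 1.0

        pos_logits += lr * advantage * grad_pos
        hp_logits += lr * advantage * grad_hp
        baseline += 0.1 * (reward - baseline)

    # Return the most probable position and hyper-parameter choice.
    return int(pos_logits.argmax()), int(hp_logits.argmax())


if __name__ == "__main__":
    best_pos, best_hp = optimize_patch(toy_target_reward)
    print("position index:", best_pos, "hyper-parameter index:", best_hp)
```

In a real attack, the reward function would paste a patch generated with the sampled hyper-parameters at the sampled position and score the target model's response. The query budget here is sized for the toy problem and says nothing about the budget reported in the article.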