Dong Yinpeng, Cheng Shuyu, Pang Tianyu, Su Hang, Zhu Jun
IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):9536-9548. doi: 10.1109/TPAMI.2021.3126733. Epub 2022 Nov 7.
Adversarial attacks have been extensively studied in recent years since they can identify the vulnerability of deep learning models before they are deployed. In this paper, we consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model. Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or from the feedback of model queries. However, the existing methods inevitably suffer from low attack success rates or poor query efficiency, since it is difficult to estimate the gradient in a high-dimensional input space with limited information. To address these problems and improve black-box attacks, we propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging, respectively. Our methods can take advantage of a transfer-based prior given by the gradient of a surrogate model and the query information simultaneously. Through theoretical analyses, the transfer-based prior is appropriately integrated with model queries by an optimal coefficient in each method. Extensive experiments demonstrate that, in comparison with the state-of-the-art alternatives, both of our methods require much fewer queries to attack black-box models with higher success rates.
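To illustrate the biased-sampling idea described above, the following is a minimal sketch, not the paper's implementation: random search directions are tilted toward a unit-norm surrogate gradient (the transfer-based prior) by a mixing coefficient `lam`, and the black-box gradient is estimated by finite-difference queries along those directions. The toy quadratic loss, the noisy surrogate, and the fixed value of `lam` are all illustrative assumptions; the paper derives an optimal coefficient theoretically.

```python
import numpy as np

rng = np.random.default_rng(0)

def fd_estimate(f, x, dirs, sigma=1e-4):
    # Average of finite-difference directional derivatives over sampled directions;
    # each term (f(x + sigma*u) - f(x)) / sigma approximates <grad f(x), u>.
    return sum((f(x + sigma * u) - f(x)) / sigma * u for u in dirs) / len(dirs)

def prgf_biased_estimate(f, x, prior, num_queries=20, lam=0.5, sigma=1e-4):
    """Biased-sampling sketch: each unit direction is a fixed mixture of the
    transfer prior v and a random unit vector orthogonal to v, so that a
    fraction lam of each direction's energy follows the prior."""
    v = prior / np.linalg.norm(prior)
    dirs = []
    for _ in range(num_queries):
        xi = rng.standard_normal(x.size)
        xi -= (xi @ v) * v              # remove the component along the prior
        xi /= np.linalg.norm(xi)
        # sqrt(lam)^2 + sqrt(1-lam)^2 = 1, so each direction has unit norm.
        dirs.append(np.sqrt(lam) * v + np.sqrt(1.0 - lam) * xi)
    return fd_estimate(f, x, dirs, sigma)

# Toy "black-box" loss (query access only) and an imperfect surrogate gradient.
A = rng.standard_normal((50, 50))
f = lambda x: 0.5 * x @ (A.T @ A) @ x
x0 = rng.standard_normal(50)
true_grad = (A.T @ A) @ x0
surrogate = true_grad + 2.0 * rng.standard_normal(50)   # noisy transfer prior

est = prgf_biased_estimate(f, x0, surrogate, num_queries=20, lam=0.5)
cos = est @ true_grad / (np.linalg.norm(est) * np.linalg.norm(true_grad))
```

With only 20 queries in a 50-dimensional space, a purely random estimator would align weakly with the true gradient; biasing the sampling toward even a noisy prior raises the cosine similarity of the estimate, which is the intuition the paper formalizes.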