

Adversarial Attack Against Deep Saliency Models Powered by Non-Redundant Priors.

Author Information

Che Zhaohui, Borji Ali, Zhai Guangtao, Ling Suiyi, Li Jing, Tian Yuan, Guo Guodong, Le Callet Patrick

Publication Information

IEEE Trans Image Process. 2021;30:1973-1988. doi: 10.1109/TIP.2021.3050303. Epub 2021 Jan 20.

Abstract

Saliency detection is an effective front-end process for many security-related tasks, e.g., autonomous driving and tracking. Adversarial attacks serve as an efficient surrogate for evaluating the robustness of deep saliency models before they are deployed in the real world. However, most current adversarial attacks exploit gradients spanning the entire image space to craft adversarial examples, ignoring the fact that natural images are high-dimensional and spatially over-redundant, which incurs expensive attack cost and poor imperceptibility. To circumvent these issues, this paper builds an efficient bridge between accessible partially-white-box source models and unknown black-box target models. The proposed method consists of two steps: 1) We design a new partially-white-box attack that defines the cost function in the compact hidden space to punish the fraction of feature activations corresponding to the salient regions, instead of punishing every pixel of the entire dense output space. This partially-white-box attack reduces the redundancy of the adversarial perturbation. 2) We exploit the non-redundant perturbations from the source models as prior cues, and use an iterative zeroth-order optimizer to compute directional derivatives along the non-redundant prior directions, thereby estimating the actual gradient of the black-box target model. The non-redundant priors boost the updates of the "critical" pixels located at the non-zero coordinates of the prior cues, while leaving the redundant pixels at the zero coordinates unaffected. Our method achieves the best trade-off between attack ability and perturbation redundancy. Finally, we conduct a comprehensive experiment that tests the robustness of 18 state-of-the-art deep saliency models against 16 malicious attacks, under both white-box and black-box settings, contributing the first robustness benchmark to the saliency community.
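The abstract only outlines the two steps, so the snippet below is a minimal, illustrative PyTorch-style sketch of how they could look in code. The function names (`hidden_space_attack`, `estimate_black_box_grad`), the squared-activation hidden-space loss, and every hyper-parameter are assumptions made for illustration, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F


def hidden_space_attack(source_model, feature_layer, image, saliency_mask,
                        eps=8 / 255, alpha=2 / 255, steps=10):
    """Step 1 (sketch): partially-white-box attack on an accessible source model.

    The cost is defined on compact hidden features rather than on every pixel of
    the dense saliency output: only activations inside the salient regions
    (given by `saliency_mask`, shape (N, 1, H, W)) are suppressed.
    All names and hyper-parameters here are illustrative assumptions.
    """
    features = {}
    handle = feature_layer.register_forward_hook(
        lambda module, inp, out: features.__setitem__("h", out))

    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        source_model(adv)                       # forward pass fills features["h"]
        h = features["h"]
        # Resize the salient-region mask to the spatial size of the hidden features
        m = F.interpolate(saliency_mask, size=h.shape[-2:])
        loss = (h * m).pow(2).mean()            # punish only salient-region activations
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv - alpha * grad.sign()     # descend to suppress those activations
            adv = image + (adv - image).clamp(-eps, eps)
            adv = adv.clamp(0, 1)
    handle.remove()
    return adv.detach()


def estimate_black_box_grad(target_loss_fn, image, prior_perturbations, sigma=1e-3):
    """Step 2 (sketch): zeroth-order gradient estimation for the black-box target.

    Finite-difference directional derivatives are taken only along the
    non-redundant prior directions from the source models, so pixels at the
    zero coordinates of the priors are never queried or updated.
    """
    grad_est = torch.zeros_like(image)
    base = target_loss_fn(image)                # one query for the unperturbed loss
    for prior in prior_perturbations:
        d = prior / (prior.norm() + 1e-12)      # unit direction from a source-model prior
        # Directional derivative: (f(x + sigma * d) - f(x)) / sigma
        directional = (target_loss_fn(image + sigma * d) - base) / sigma
        grad_est += directional * d             # accumulate along the prior direction only
    return grad_est / max(len(prior_perturbations), 1)
```

In this reading, `target_loss_fn` stands for a black-box query (for example, a distance between the target model's predicted saliency map and a reference map), and the estimated gradient would then drive an ordinary iterative update of the adversarial example, touching only the pixels selected by the priors.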

