
Diffusion Models for Imperceptible and Transferable Adversarial Attack.

Authors

Chen Jianqi, Chen Hao, Chen Keyan, Zhang Yilan, Zou Zhengxia, Shi Zhenwei

Publication

IEEE Trans Pattern Anal Mach Intell. 2025 Feb;47(2):961-977. doi: 10.1109/TPAMI.2024.3480519. Epub 2025 Jan 9.

Abstract

Many existing adversarial attacks generate L_p-norm perturbations on image RGB space. Despite some achievements in transferability and attack success rate, the crafted adversarial examples are easily perceived by human eyes. Towards visual imperceptibility, some recent works explore unrestricted attacks without L_p-norm constraints, yet these lack transferability when attacking black-box models. In this work, we propose a novel imperceptible and transferable attack by leveraging both the generative and discriminative power of diffusion models. Specifically, instead of direct manipulation in pixel space, we craft perturbations in the latent space of diffusion models. Combined with well-designed content-preserving structures, we can generate human-insensitive perturbations embedded with semantic clues. For better transferability, we further "deceive" the diffusion model, which can be viewed as an implicit recognition surrogate, by distracting its attention away from the target regions. To our knowledge, our proposed method, DiffAttack, is the first to introduce diffusion models into the adversarial attack field. Extensive experiments conducted across diverse model architectures (CNNs, Transformers, and MLPs), datasets (ImageNet, CUB-200, and Stanford Cars), and defense mechanisms underscore the superiority of our attack over existing methods such as iterative attacks, GAN-based attacks, and ensemble attacks. Furthermore, we provide a comprehensive discussion on future research avenues in diffusion-based adversarial attacks, aiming to chart a course for this burgeoning field.
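Since the abstract only sketches the mechanism, a rough code illustration may help. The sketch below shows the core idea of attacking in a diffusion model's latent space rather than RGB space, under loudly stated assumptions: it optimizes the latent of Stable Diffusion's VAE directly against a surrogate ResNet-50, standing in for the paper's full pipeline (DDIM inversion, attention distraction, and dedicated content-preserving structures). The model names (stabilityai/sd-vae-ft-mse, ResNet-50) and hyperparameters are illustrative choices, not the authors' released configuration.

```python
# Minimal sketch of a latent-space adversarial attack, simplified from the
# paper's idea. Everything here is an illustrative assumption: it perturbs
# the latent of Stable Diffusion's VAE directly, skipping the paper's DDIM
# inversion and attention-distraction losses.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights
from diffusers import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"

# White-box surrogate classifier used while crafting the example.
weights = ResNet50_Weights.IMAGENET1K_V2
clf = resnet50(weights=weights).to(device).eval().requires_grad_(False)
preprocess = weights.transforms()  # differentiable resize/crop/normalize

# Stable Diffusion's VAE: perturbations live in its latent space, not RGB.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae = vae.to(device).eval().requires_grad_(False)

def latent_attack(x, label, steps=50, lr=1e-2, content_w=10.0):
    """x: (1, 3, H, W) image in [0, 1], H and W multiples of 8;
    label: ground-truth ImageNet class index."""
    with torch.no_grad():
        z0 = vae.encode(x * 2 - 1).latent_dist.mean  # clean latent
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    y = torch.tensor([label], device=device)
    for _ in range(steps):
        x_adv = (vae.decode(z).sample + 1) / 2       # latent -> [0, 1] image
        logits = clf(preprocess(x_adv.clamp(0, 1)))
        # Maximize classification loss while penalizing drift from the clean
        # image: a crude stand-in for the paper's content-preserving design.
        loss = -F.cross_entropy(logits, y) + content_w * F.mse_loss(x_adv, x)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (vae.decode(z.detach()).sample + 1) / 2
```

In this sketch, content_w is the knob trading attack strength against visual fidelity; in the paper that role is played by purpose-built content-preserving structures rather than a plain pixel-space MSE penalty.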

