Provable Unrestricted Adversarial Training Without Compromise With Generalizability.

Authors

Zhang Lilin, Yang Ning, Sun Yanchao, Yu Philip S

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8302-8319. doi: 10.1109/TPAMI.2024.3400988. Epub 2024 Nov 6.

DOI:10.1109/TPAMI.2024.3400988

Abstract

Adversarial training (AT) is widely considered the most promising strategy to defend against adversarial attacks and has drawn increasing interest from researchers. However, existing AT methods still suffer from two challenges. First, they are unable to handle unrestricted adversarial examples (UAEs), which are built from scratch, as opposed to restricted adversarial examples (RAEs), which are created by adding perturbations bounded by an ℓ_p norm to observed examples. Second, existing AT methods often achieve adversarial robustness at the expense of standard generalizability (i.e., the accuracy on natural examples) because they make a tradeoff between the two. To overcome these challenges, we propose a unique viewpoint that understands UAEs as imperceptibly perturbed unobserved examples. We also find that the tradeoff results from the separation of the distributions of adversarial examples and natural examples. Based on these ideas, we propose a novel AT approach called Provable Unrestricted Adversarial Training (PUAT), which can provide a target classifier with comprehensive adversarial robustness against both UAEs and RAEs, and simultaneously improve its standard generalizability. In particular, PUAT utilizes partially labeled data to achieve effective UAE generation by accurately capturing the natural data distribution through a novel augmented triple-GAN. At the same time, PUAT extends traditional AT by introducing the supervised loss of the target classifier into the adversarial loss and, with the collaboration of the augmented triple-GAN, achieves alignment between the UAE distribution, the natural data distribution, and the distribution learned by the classifier. Finally, solid theoretical analysis and extensive experiments conducted on widely used benchmarks demonstrate the superiority of PUAT.
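To make the abstract's notion of a restricted adversarial example concrete, the following is a minimal sketch, not the paper's method. It assumes an ℓ∞ bound (one common choice of norm), a toy logistic classifier, and a single signed-gradient step; the names `logistic_loss` and `linf_restricted_example` and all numeric values are hypothetical. UAE generation with the augmented triple-GAN is not sketched here.

```python
import math

def logistic_loss(w, b, x, y):
    """Binary cross-entropy of a linear classifier on one example (y in {0, 1})."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def linf_restricted_example(w, b, x, y, eps):
    """Perturb an observed example x by one signed-gradient step.

    Each coordinate moves by at most eps, so the perturbation stays
    inside the L-infinity ball of radius eps around x.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    grad = [(p - y) * wi for wi in w]  # gradient of the loss w.r.t. x
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

# Hypothetical toy classifier and observed example (assumed values).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
eps = 0.1

x_adv = linf_restricted_example(w, b, x, y, eps)
clean_loss = logistic_loss(w, b, x, y)
adv_loss = logistic_loss(w, b, x_adv, y)
```

In this sketch the perturbed point stays within `eps` of the observed example in every coordinate while the classifier's loss on it goes up, which is the sense in which an RAE is tied to an observed example; a UAE, by the abstract's description, is built from scratch instead.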

Similar articles

Evaluation of GAN-Based Model for Adversarial Training.

Sensors (Basel). 2023 Mar 1;23(5):2697. doi: 10.3390/s23052697.

Adversarial attacks and defenses using feature-space stochasticity.

Neural Netw. 2023 Oct;167:875-889. doi: 10.1016/j.neunet.2023.08.022. Epub 2023 Aug 21.

Toward Intrinsic Adversarial Robustness Through Probabilistic Training.

IEEE Trans Image Process. 2023;32:3862-3872. doi: 10.1109/TIP.2023.3290532. Epub 2023 Jul 14.

Adversarial Robustness of Deep Reinforcement Learning Based Dynamic Recommender Systems.

Front Big Data. 2022 May 3;5:822783. doi: 10.3389/fdata.2022.822783. eCollection 2022.

Robustness meets accuracy in adversarial training for graph autoencoder.

Neural Netw. 2023 Jan;157:114-124. doi: 10.1016/j.neunet.2022.10.010. Epub 2022 Oct 20.

Boosting adversarial robustness via self-paced adversarial training.

Neural Netw. 2023 Oct;167:706-714. doi: 10.1016/j.neunet.2023.08.063. Epub 2023 Sep 9.

Diffusion Models for Imperceptible and Transferable Adversarial Attack.

IEEE Trans Pattern Anal Mach Intell. 2025 Feb;47(2):961-977. doi: 10.1109/TPAMI.2024.3480519. Epub 2025 Jan 9.

Revisiting the Trade-Off Between Accuracy and Robustness via Weight Distribution of Filters.

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8870-8882. doi: 10.1109/TPAMI.2024.3411035. Epub 2024 Nov 6.

Perturbation diversity certificates robust generalization.

Neural Netw. 2024 Apr;172:106117. doi: 10.1016/j.neunet.2024.106117. Epub 2024 Jan 8.

PMID:38743549
