Zhao Jie-Chao, Ding Jin, Sun Yong-Zhi, Tan Ping, Ma Ji-En, Fang You-Tong
School of Automation and Electrical Engineering & Key Institute of Robotics of Zhejiang Province, Zhejiang University of Science and Technology, Hangzhou, China.
State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou, China.
PLoS One. 2025 Jan 7;20(1):e0317023. doi: 10.1371/journal.pone.0317023. eCollection 2025.
Adversarial training has become a primary method for enhancing the robustness of deep learning models. In recent years, fast adversarial training methods have gained widespread attention due to their lower computational cost. However, because fast adversarial training uses single-step rather than multi-step adversarial attacks, the generated adversarial examples lack diversity, making models prone to catastrophic overfitting and loss of robustness. Existing methods for preventing catastrophic overfitting have shortcomings of their own, such as poor robustness because the generated adversarial examples are too weak, and low accuracy caused by excessive total perturbation. To address these issues, this paper proposes a fast adversarial training method called fast adversarial training with adaptive similarity step size (ATSS). In this method, random noise is first added to the clean input samples, and the gradient is then computed for each input sample. The perturbation step size for each sample is determined by the similarity between the input noise and the gradient direction. Finally, adversarial examples are generated from the step size and gradient and used for adversarial training. We evaluate ResNet18 and VGG19 models under various adversarial attacks on the CIFAR-10, CIFAR-100 and Tiny ImageNet datasets. The experimental results demonstrate that our method effectively avoids catastrophic overfitting. Compared with other fast adversarial training methods, ATSS achieves higher robust accuracy and clean accuracy with almost no additional training cost.
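The four steps named in the abstract (add noise, compute a per-sample gradient, derive a step size from the noise/gradient similarity, take the step) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not give the formula that maps similarity to step size, so the rule below is an assumption: the step shrinks linearly as the noise aligns with the gradient sign, on the guess that this limits total perturbation. The toy logistic-regression model, the function names, the parameters alpha_min and alpha_max, and the clipping to [0, 1] are also hypothetical; whether ATSS projects the result back into an epsilon ball is not stated, so no projection is done here.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def input_gradient(w, x, y):
    """Gradient of the logistic loss log(1 + exp(-y * w.x)) with respect to x.

    Toy stand-in for backpropagation through a deep network; y is -1 or +1.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    scale = -y / (1.0 + math.exp(margin))  # equals -y * sigmoid(-margin)
    return [scale * wi for wi in w]


def adaptive_step_size(noise, grad, alpha_min, alpha_max):
    """ASSUMED rule (not from the paper): similarity +1 maps to alpha_min,
    similarity -1 maps to alpha_max, with linear interpolation in between."""
    sim = cosine_similarity(noise, [sign(g) for g in grad])
    return alpha_min + (alpha_max - alpha_min) * (1.0 - sim) / 2.0


def make_adversarial(w, x, y, eps, alpha_min, alpha_max, rng):
    """Build one adversarial example for a single sample; returns (x_adv, alpha)."""
    # Step 1: add uniform random noise to the clean sample.
    noise = [rng.uniform(-eps, eps) for _ in x]
    x_noisy = [xi + ni for xi, ni in zip(x, noise)]
    # Step 2: gradient of the loss at the noisy input.
    grad = input_gradient(w, x_noisy, y)
    # Step 3: per-sample step size from the noise/gradient similarity.
    alpha = adaptive_step_size(noise, grad, alpha_min, alpha_max)
    # Step 4: single signed-gradient step, clipped to the valid pixel range.
    x_adv = [min(1.0, max(0.0, xn + alpha * sign(g)))
             for xn, g in zip(x_noisy, grad)]
    return x_adv, alpha


rng = random.Random(0)
w = [0.5, -1.0, 2.0, 0.25]   # hypothetical model weights
x = [0.2, 0.8, 0.5, 0.4]     # hypothetical clean sample in [0, 1]
x_adv, alpha = make_adversarial(w, x, +1, eps=8 / 255,
                                alpha_min=4 / 255, alpha_max=10 / 255, rng=rng)
```

In an actual training loop, x_adv would be passed back through the network and the loss on it used for the weight update. With a deep model the gradient would come from automatic differentiation, not from the closed form used for this toy model.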