Jiancong Xiao, Ruoyu Sun, Qi Long, Weijie J. Su
University of Pennsylvania.
The Chinese University of Hong Kong, Shenzhen.
Proc Mach Learn Res. 2024;247:5074-5075.
Training Deep Neural Networks (DNNs) with adversarial examples often results in poor generalization to test-time adversarial data. This paper investigates this issue, known as adversarially robust generalization, through the lens of Rademacher complexity. Building upon the studies of Khim and Loh (2018) and Yin et al. (2019), numerous works have been devoted to this problem, yet a satisfactory bound remains elusive. Existing results for DNNs either apply to a surrogate loss instead of the robust loss, or yield bounds that are notably looser than their standard counterparts; in the latter case, the bounds carry an additional dependency on the width of the DNNs or the dimension of the data. This paper presents upper bounds on the adversarial Rademacher complexity of DNNs that match the best-known upper bounds in the standard setting, established by Bartlett et al. (2017), with no worse dependency on width or dimension. The central challenge is bounding the covering number of adversarial function classes. We aim to construct a new cover that possesses two properties: 1) compatibility with adversarial examples, and 2) precision comparable to the covers used in standard settings. To this end, we introduce a new variant of the covering number, specifically designed and proven to reconcile these two properties. Consequently, our method effectively bridges the gap between the Rademacher complexities of robust and standard generalization.
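For background, the quantities the abstract refers to can be made precise with the standard definitions below; these are textbook statements (stated here up to the usual normalization conventions), not a reproduction of the paper's new covering-number variant.

```latex
% Adversarial (robust) Rademacher complexity of a loss class over a
% sample S = \{(x_i, y_i)\}_{i=1}^{n}, with i.i.d. Rademacher variables
% \sigma_i \in \{\pm 1\} and perturbation budget \epsilon:
\mathcal{R}_S(\tilde{\ell} \circ \mathcal{F})
  = \mathbb{E}_{\sigma} \sup_{f \in \mathcal{F}} \frac{1}{n}
    \sum_{i=1}^{n} \sigma_i
    \max_{\|x_i' - x_i\| \le \epsilon} \ell\bigl(f(x_i'), y_i\bigr).

% A \tau-cover of \mathcal{F} in a norm \|\cdot\| is a finite set C such
% that every f \in \mathcal{F} has some g \in C with \|f - g\| \le \tau;
% the covering number \mathcal{N}(\mathcal{F}, \tau, \|\cdot\|) is the
% smallest cardinality of such a C. Dudley's entropy integral converts
% covering-number bounds into a Rademacher-complexity bound
% (here M bounds the range of the functions in \mathcal{F}):
\mathcal{R}_S(\mathcal{F})
  \le \inf_{\alpha > 0} \left( 4\alpha
    + \frac{12}{\sqrt{n}} \int_{\alpha}^{M}
      \sqrt{\ln \mathcal{N}\bigl(\mathcal{F}, \tau, \|\cdot\|_{L_2(S)}\bigr)}
      \, d\tau \right).
```

The technical difficulty the abstract alludes to is that the inner maximization over $x_i'$ makes the function class harder to cover: covers built for the standard class need not remain valid after composing with the adversarial maximum, which is why a cover that is both adversarially compatible and as tight as the standard one is needed.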