Department of Physiology, The University of Tokyo School of Medicine, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan.
Department of Physiology, The University of Tokyo School of Medicine, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan; International Research Center for Neurointelligence (WPI-IRCN), 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan; Institute for AI and Beyond, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan.
Neural Netw. 2023 Oct;167:875-889. doi: 10.1016/j.neunet.2023.08.022. Epub 2023 Aug 21.
Recent studies in deep neural networks have shown that injecting random noise into the input layer of a network improves robustness against ℓp-norm-bounded adversarial perturbations. However, such input-layer random noise may not suffice to defend against unrestricted adversarial examples, most of which are not ℓp-norm-bounded in the input space. In the first part of this study, we generated a novel class of unrestricted adversarial examples, termed feature-space adversarial examples. These examples are far from the original data in the input space, adjacent to the original data in a hidden-layer feature space, and far from them again in the output layer. In the second part of this study, we empirically showed that while injecting random noise into the input layer failed to defend against these feature-space adversarial examples, injecting random noise into the hidden layer defended against them. These results highlight a novel benefit of stochasticity in higher layers: it is useful for defending against feature-space adversarial examples, a class of unrestricted adversarial examples.
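The two ingredients of the abstract can be illustrated in a toy sketch: (1) a feature-space adversarial candidate, i.e. an input that is far from the original in input space yet close in a hidden-layer feature space, and (2) noise injection into the hidden layer rather than the input layer. This is a minimal, illustrative NumPy mock-up, not the authors' implementation: the network is a random two-layer MLP, and the candidate search uses random sampling in place of the paper's gradient-based optimization. All names, sizes, and thresholds here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MLP with random weights (stand-in for the paper's image classifiers):
# input (4-dim) -> hidden features (8-dim, tanh) -> logits (3-dim).
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))

def hidden(x):
    """Hidden-layer feature representation."""
    return np.tanh(x @ W1)

def forward(x, hidden_noise_std=0.0):
    """Forward pass; optionally inject Gaussian noise into the HIDDEN layer,
    the defense highlighted in the abstract (vs. input-layer noise)."""
    h = hidden(x)
    if hidden_noise_std > 0:
        h = h + rng.normal(scale=hidden_noise_std, size=h.shape)
    return h @ W2

x = rng.normal(size=(1, 4))  # "original data" point

# Feature-space adversarial candidate: FAR from x in input space (distance
# >= 2.0 here, an arbitrary illustrative threshold) while CLOSE to x in the
# hidden-feature space. Random search replaces gradient-based optimization.
best, best_feat_dist = None, np.inf
for _ in range(2000):
    cand = x + rng.normal(scale=2.0, size=x.shape)   # large input perturbation
    if np.linalg.norm(cand - x) < 2.0:               # keep only far candidates
        continue
    d = np.linalg.norm(hidden(cand) - hidden(x))
    if d < best_feat_dist:
        best, best_feat_dist = cand, d

print("input-space distance:  ", float(np.linalg.norm(best - x)))
print("feature-space distance:", float(best_feat_dist))

# Hidden-layer noise changes the logits run-to-run, which is what makes the
# (deterministic) feature-space attack harder to aim at the hidden layer.
print("clean logits:", forward(x))
print("noisy logits:", forward(x, hidden_noise_std=0.5))
```

The candidate found this way is distant in input space (so no ℓp input-norm bound captures it) while nearly matching the original's hidden features, which is the defining property of the feature-space adversarial examples described above.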