Google, Mountain View, CA, USA.
Department of Cognitive Science, University of California, San Diego, CA, USA.
Nat Commun. 2023 Aug 15;14(1):4933. doi: 10.1038/s41467-023-40499-0.
Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations: subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN.
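To make the notion of an adversarial perturbation concrete, below is a minimal sketch of one standard attack, the fast gradient sign method (FGSM, Goodfellow et al., 2015): each pixel is nudged by a small epsilon in the direction that increases the classifier's loss. This is an illustrative assumption, not the specific attack used in the paper; the model, epsilon, and random input here are placeholders.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def fgsm_perturb(model, image, label, epsilon=2.0 / 255):
    """Return image + epsilon * sign(grad of loss w.r.t. image) -- a
    perturbation that is tiny per pixel but systematically aligned with
    the loss gradient, which is what can flip the classifier's decision."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + epsilon * image.grad.sign()
    return adv.clamp(0.0, 1.0).detach()

# Untrained weights keep the sketch runnable offline; in practice one
# would load a pretrained classifier (e.g. weights=...DEFAULT).
model = models.resnet18(weights=None).eval()
x = torch.rand(1, 3, 224, 224)       # stand-in for a natural image
y = model(x).argmax(dim=1)           # the model's original prediction
x_adv = fgsm_perturb(model, x, y)
print(model(x).argmax(dim=1).item(), model(x_adv).argmax(dim=1).item())
```

With epsilon on the order of a few intensity levels, the perturbed image is visually near-identical to the original, which is why a human observer might dismiss the change as an imaging artifact even when the model's label changes.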