Fan Jinyu, Zeng Yi
Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
Patterns (N Y). 2023 Feb 28;4(3):100695. doi: 10.1016/j.patter.2023.100695. eCollection 2023 Mar 10.
Even state-of-the-art deep learning models lack fundamental abilities compared with humans. While many image distortions have been proposed to compare deep learning with humans, they depend on mathematical transformations instead of human cognitive functions. Here, we propose an image distortion based on the abutting grating illusion, which is a phenomenon discovered in humans and animals. The distortion generates illusory contour perception using line gratings abutting each other. We applied the method to MNIST, high-resolution MNIST, and "16-class-ImageNet" silhouettes. Many models, including models trained from scratch and 109 models pretrained with ImageNet or various data augmentation techniques, were tested. Our results show that abutting grating distortion is challenging even for state-of-the-art deep learning models. We discovered that DeepAugment models outperformed other pretrained models. Visualization of early layers indicates that better-performing models exhibit the endstopping property, which is consistent with neuroscience discoveries. Twenty-four human subjects classified distorted samples to validate the distortion.
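The core idea of the distortion, line gratings that abut along the outline of a shape, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `abutting_grating`, the `period` and `shift` parameters, and the choice of horizontal one-pixel lines are all assumptions made for illustration. The sketch draws one grating outside a binary mask and a phase-shifted copy inside it, so the shape is conveyed only by where the line ends meet.

```python
import numpy as np

def abutting_grating(mask, period=4, shift=None):
    """Render a binary mask as two abutting horizontal line gratings.

    Hypothetical helper for illustration only; parameter names and
    defaults are assumptions, not taken from the paper.
    """
    mask = np.asarray(mask, dtype=bool)
    if shift is None:
        shift = period // 2  # assumed default: half-period phase offset
    # Column vector of row indices, broadcast across all columns.
    rows = np.arange(mask.shape[0])[:, None]
    background = (rows % period) == 0            # lines outside the shape
    foreground = ((rows - shift) % period) == 0  # shifted lines inside the shape
    # Pick the shifted grating inside the mask, the unshifted one outside.
    return np.where(mask, foreground, background).astype(np.uint8)

# Toy example: an 8x8 canvas with a 4x4 square in the middle.
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
out = abutting_grating(mask, period=4)
print(out)
```

In the output, no pixel marks the square's boundary directly. The line segments inside the square sit on different rows from those outside it, and the outline is implied only by where those segments terminate.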