Orthogonal Representations of Object Shape and Category in Deep Convolutional Neural Networks and Human Visual Cortex.

Affiliations

Department of Brain and Cognition & Leuven Brain Institute, KU Leuven, Leuven, Belgium.

Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy.

Publication information

Sci Rep. 2020 Feb 12;10(1):2453. doi: 10.1038/s41598-020-59175-0.

Abstract

Deep Convolutional Neural Networks (CNNs) are gaining traction as the benchmark model of visual object recognition, with performance now surpassing humans. While CNNs can accurately assign one image to potentially thousands of categories, network performance could be the result of layers that are tuned to represent the visual shape of objects, rather than object category, since both are often confounded in natural images. Using two stimulus sets that explicitly dissociate shape from category, we correlate these two types of information with each layer of multiple CNNs. We also compare CNN output with fMRI activation along the human visual ventral stream by correlating artificial with neural representations. We find that CNNs encode category information independently from shape, peaking at the final fully connected layer in all tested CNN architectures. Comparing CNNs with fMRI brain data, early visual cortex (V1) and early layers of CNNs encode shape information. Anterior ventral temporal cortex encodes category information, which correlates best with the final layer of CNNs. The interaction between shape and category that is found along the human visual ventral pathway is echoed in multiple deep networks. Our results suggest CNNs represent category information independently from shape, much like the human visual system.
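The layer-by-layer comparison described in the abstract can be illustrated with a small representational-similarity sketch. This is not the authors' pipeline: the four toy stimuli, the activation values, the layer names `early` and `late`, the helper names, and the use of Pearson correlation to compare dissimilarity matrices are all assumptions made for illustration. The idea is to build one model dissimilarity matrix from shape labels and one from category labels, then check which model each layer's activation dissimilarities follow more closely.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def label_rdm(labels):
    """Model dissimilarities (upper triangle): 0 if a pair shares a label, else 1."""
    return [0.0 if a == b else 1.0 for a, b in combinations(labels, 2)]


def activation_rdm(patterns):
    """Activation dissimilarities (upper triangle): 1 - Pearson r per stimulus pair."""
    return [1.0 - pearson(p, q) for p, q in combinations(patterns, 2)]


# Hypothetical stimulus set: every shape appears in every category,
# so shape and category are dissociated by design.
shape_labels = ["A", "A", "B", "B"]
category_labels = ["c1", "c2", "c1", "c2"]

# Made-up activations (one row per stimulus, one column per unit).
layers = {
    # Rows sharing a shape look alike.
    "early": [
        [1.0, 0.1, 0.9, 0.0],
        [0.9, 0.0, 1.0, 0.1],
        [0.1, 1.0, 0.0, 0.9],
        [0.0, 0.9, 0.1, 1.0],
    ],
    # Rows sharing a category look alike.
    "late": [
        [1.0, 0.1, 0.9, 0.0],
        [0.1, 1.0, 0.0, 0.9],
        [0.9, 0.0, 1.0, 0.1],
        [0.0, 0.9, 0.1, 1.0],
    ],
}

models = {
    "shape": label_rdm(shape_labels),
    "category": label_rdm(category_labels),
}

# Correlate each layer's dissimilarities with each model's dissimilarities.
scores = {
    layer: {name: pearson(activation_rdm(acts), rdm) for name, rdm in models.items()}
    for layer, acts in layers.items()
}

for layer, result in scores.items():
    print(layer, {name: round(r, 3) for name, r in result.items()})
```

With these toy values the `early` layer follows the shape model and the `late` layer follows the category model, which mirrors the qualitative pattern the abstract reports. Real analyses would use activations extracted from an actual network and many more stimuli.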

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ded/7016009/b26252b6503d/41598_2020_59175_Fig1_HTML.jpg
