Orthogonal Representations of Object Shape and Category in Deep Convolutional Neural Networks and Human Visual Cortex.

Affiliations

Department of Brain and Cognition & Leuven Brain Institute, KU Leuven, Leuven, Belgium.

Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy.

Publication information

Sci Rep. 2020 Feb 12;10(1):2453. doi: 10.1038/s41598-020-59175-0.

DOI:10.1038/s41598-020-59175-0

PMID:32051467

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7016009/

Abstract

Deep Convolutional Neural Networks (CNNs) are gaining traction as the benchmark model of visual object recognition, with performance now surpassing humans. While CNNs can accurately assign one image to potentially thousands of categories, network performance could be the result of layers that are tuned to represent the visual shape of objects, rather than object category, since both are often confounded in natural images. Using two stimulus sets that explicitly dissociate shape from category, we correlate these two types of information with each layer of multiple CNNs. We also compare CNN output with fMRI activation along the human visual ventral stream by correlating artificial with neural representations. We find that CNNs encode category information independently from shape, peaking at the final fully connected layer in all tested CNN architectures. Comparing CNNs with fMRI brain data, early visual cortex (V1) and early layers of CNNs encode shape information. Anterior ventral temporal cortex encodes category information, which correlates best with the final layer of CNNs. The interaction between shape and category that is found along the human visual ventral pathway is echoed in multiple deep networks. Our results suggest CNNs represent category information independently from shape, much like the human visual system.
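The layer-by-layer comparison described in the abstract can be illustrated with a small sketch in the style of representational similarity analysis. The abstract does not specify the exact procedure, so everything here is an assumption made for illustration: the 4 x 4 crossed design, the simulated "early" and "late" activations, the binary model matrices, and the use of Pearson correlation. It is not the authors' pipeline, and no real CNN or fMRI data are involved.

```python
import numpy as np

def model_rdm(labels):
    """Binary model dissimilarity matrix: 0 if two stimuli share a label, else 1."""
    labels = np.asarray(labels)
    return (labels[:, None] != labels[None, :]).astype(float)

def activation_rdm(acts):
    """Dissimilarity matrix from (n_stimuli, n_units) activations: 1 - Pearson r between rows."""
    return 1.0 - np.corrcoef(acts)

def upper(rdm):
    """Vectorize the strict upper triangle of a square matrix."""
    i, j = np.triu_indices(rdm.shape[0], k=1)
    return rdm[i, j]

def rdm_similarity(a, b):
    """Pearson correlation between the off-diagonal entries of two matrices."""
    return float(np.corrcoef(upper(a), upper(b))[0, 1])

# Hypothetical crossed design: every shape appears in every category,
# so shape and category labels are dissociated by construction.
n_shapes, n_categories, n_units = 4, 4, 200
shape = np.repeat(np.arange(n_shapes), n_categories)
category = np.tile(np.arange(n_categories), n_shapes)
n_stimuli = shape.size
shape_rdm, category_rdm = model_rdm(shape), model_rdm(category)

# Simulated stand-ins for network layers (assumption, not real activations):
# the "early" layer is driven by shape, the "late" layer by category.
rng = np.random.default_rng(0)
shape_code = rng.normal(size=(n_shapes, n_units))
category_code = rng.normal(size=(n_categories, n_units))
layers = {
    "early": shape_code[shape] + 0.5 * rng.normal(size=(n_stimuli, n_units)),
    "late": category_code[category] + 0.5 * rng.normal(size=(n_stimuli, n_units)),
}

# Correlate each layer's dissimilarity structure with both models.
scores = {
    name: (rdm_similarity(activation_rdm(acts), shape_rdm),
           rdm_similarity(activation_rdm(acts), category_rdm))
    for name, acts in layers.items()
}
for name, (s, c) in scores.items():
    print(f"{name}: shape r={s:.2f}, category r={c:.2f}")
```

With real data, the simulated matrices in `layers` would be replaced by activations extracted from each network layer (or by voxel patterns from a brain region) for the same stimuli, and a rank correlation could be substituted for Pearson if preferred.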

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ded/7016009/b26252b6503d/41598_2020_59175_Fig1_HTML.jpg
