Representing Multiple Visual Objects in the Human Brain and Convolutional Neural Networks

Authors

Mocz Viola, Jeong Su Keun, Chun Marvin, Xu Yaoda

Affiliations

Visual Cognitive Neuroscience Lab, Department of Psychology, Yale University, New Haven, CT 06520, USA.

Department of Psychology, Chungbuk National University, South Korea.

Publication information

bioRxiv. 2023 Mar 1:2023.02.28.530472. doi: 10.1101/2023.02.28.530472.

DOI:10.1101/2023.02.28.530472

PMID:36909506

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10002658/

Abstract

Objects in the real world often appear with other objects. In primate object-processing regions, neural responses to an object pair have been shown to be well approximated by the average of the responses to each constituent object shown alone, which allows the identity of an object to be recovered whether or not other objects are encoded concurrently and indicates that the whole is equal to the average of its parts. This averaging is present at the single-unit level, in the slope of response amplitudes of macaque IT neurons to paired and single objects, and at the population level, in response patterns of fMRI voxels in human ventral object-processing regions (e.g., LO). Here we show that averaging exists in both single fMRI voxels and voxel population responses in human LO, with better averaging in single voxels leading to better averaging in fMRI response patterns, demonstrating a close correspondence of averaging at the fMRI unit and population levels. To determine whether a similar averaging mechanism exists in convolutional neural networks (CNNs) pretrained for object classification, we examined five CNNs that varied in architecture, depth, and the presence or absence of recurrent processing. We observed averaging at the CNN unit level but rarely at the population level, and the CNN unit response distributions in most cases did not resemble human LO or macaque IT responses. The whole is thus not equal to the average of its parts in CNNs, potentially rendering the individual objects in a pair less accessible in CNNs during visual processing than they are in the human brain.
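
The averaging relationship described in the abstract can be illustrated with a minimal sketch on synthetic data. This is not the authors' analysis pipeline: the function names, the array shapes, the noise level, and the choice to regress pair responses on the sum of the two single-object responses (so that perfect averaging gives a slope of 0.5) are all assumptions made for illustration.

```python
import numpy as np


def unit_slopes(single_a, single_b, pair):
    """Per-unit least-squares slope of pair responses regressed on the
    sum of the two single-object responses, computed across object pairs.

    All inputs have shape (n_units, n_pairs). Under perfect averaging,
    pair = 0.5 * (a + b), so the slope is 0.5 (illustrative assumption).
    """
    x = single_a + single_b
    xc = x - x.mean(axis=1, keepdims=True)
    yc = pair - pair.mean(axis=1, keepdims=True)
    return (xc * yc).sum(axis=1) / (xc ** 2).sum(axis=1)


def pattern_correlations(single_a, single_b, pair):
    """Per-pair Pearson correlation, across units, between the actual
    pair response pattern and the average of the two single patterns."""
    avg = (single_a + single_b) / 2.0
    ac = avg - avg.mean(axis=0, keepdims=True)
    pc = pair - pair.mean(axis=0, keepdims=True)
    denom = np.sqrt((ac ** 2).sum(axis=0) * (pc ** 2).sum(axis=0))
    return (ac * pc).sum(axis=0) / denom


# Synthetic "units" (voxels, neurons, or CNN units) that average, plus noise.
rng = np.random.default_rng(0)
n_units, n_pairs = 200, 50
resp_a = rng.normal(size=(n_units, n_pairs))
resp_b = rng.normal(size=(n_units, n_pairs))
resp_pair = 0.5 * (resp_a + resp_b) + 0.1 * rng.normal(size=(n_units, n_pairs))

slopes = unit_slopes(resp_a, resp_b, resp_pair)
correlations = pattern_correlations(resp_a, resp_b, resp_pair)
print("mean unit-level slope:", slopes.mean())
print("mean pattern correlation:", correlations.mean())
```

In this toy setting both measures indicate averaging by construction. The abstract's finding for CNNs corresponds to the case where the unit-level measure looks like averaging while the population-level pattern measure does not.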

