Department of Psychology, Vision Sciences Laboratory, Harvard University, Cambridge, MA, United States of America.
Department of Psychology, Yale University, New Haven, CT, United States of America.
PLoS One. 2021 Jun 30;16(6):e0253442. doi: 10.1371/journal.pone.0253442. eCollection 2021.
To interact with real-world objects, any effective visual system must jointly code the unique features defining each object. Despite decades of neuroscience research, we still lack a firm grasp of how the primate brain binds visual features. Here we apply a novel network-based, stimulus-rich representational similarity approach to study color and form binding in five convolutional neural networks (CNNs) with varying architectures, depths, and presence or absence of recurrent processing. All CNNs showed near-orthogonal color and form processing in early layers but increasingly interactive feature coding in higher layers, with this effect being much stronger for networks trained for object classification than for untrained networks. These results characterize for the first time how multiple basic visual features are coded together in CNNs. The approach developed here can be readily applied to characterize whether a similar coding scheme may serve as a viable solution to the binding problem in the primate brain.
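To make the representational similarity logic concrete, the sketch below shows one common way such an analysis can be set up: build a representational dissimilarity matrix (RDM) from a CNN layer's responses to a stimulus set crossing colors and forms, then correlate it with feature-model RDMs. This is a minimal illustration, not the paper's exact pipeline; the toy data, the `color_model`/`form_model` predictors, and the function names are assumptions introduced here for demonstration only.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(activations):
    """Condensed representational dissimilarity matrix:
    1 - Pearson correlation between every pair of stimulus patterns.
    `activations` is an (n_stimuli, n_units) array of layer responses."""
    return pdist(activations, metric="correlation")

def rdm_similarity(rdm_a, rdm_b):
    """Spearman correlation between two condensed RDMs, a standard way
    to compare representational geometries across layers or models."""
    rho, _ = spearmanr(rdm_a, rdm_b)
    return rho

# Toy example: how strongly does a layer's geometry track color vs. form?
# In a real analysis, `layer_acts` would hold unit responses to stimuli
# crossing colors x shapes, extracted from a given CNN layer.
rng = np.random.default_rng(0)
layer_acts = rng.normal(size=(36, 512))            # 36 stimuli x 512 units (random placeholder)
color_labels = rng.integers(0, 6, (36, 1))          # hypothetical color label per stimulus
form_labels = rng.integers(0, 6, (36, 1))           # hypothetical form label per stimulus
color_model = pdist(color_labels, metric="hamming")  # 0 = same color, 1 = different
form_model = pdist(form_labels, metric="hamming")    # 0 = same form, 1 = different

layer_rdm = rdm(layer_acts)
print("color fit:", rdm_similarity(layer_rdm, color_model))
print("form fit:", rdm_similarity(layer_rdm, form_model))
```

Comparing such fits layer by layer is one way to ask whether color and form are coded independently (each model explains separate variance) or interactively (the layer geometry reflects specific color-form conjunctions).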