Center for Adaptive Systems, Department of Cognitive and Neural Systems, Center of Excellence for Learning in Education, Science, and Technology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
Neural Netw. 2011 Dec;24(10):1050-61. doi: 10.1016/j.neunet.2011.04.004. Epub 2011 Apr 22.
All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include the retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The positional ARTSCAN (pARTSCAN) model proposes how the following additional processes in the What cortical processing stream also enable position-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object. 
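The mechanism named above, normalizing object category competition combined with a view-to-object learning law, can be illustrated with a toy sketch. This is not the paper's implementation: the shunting on-center off-surround dynamics, the instar-style gated update, and every parameter (A, B, lr, the step sizes) are illustrative assumptions chosen in the general spirit of such models.

```python
import numpy as np

def normalizing_competition(inputs, steps=50, dt=0.1, A=1.0, B=1.0):
    """Toy shunting competition: each category activity is excited by its
    own input and inhibited by all other inputs, so activities stay bounded
    and are normalized across the category field."""
    x = np.zeros_like(inputs, dtype=float)
    for _ in range(steps):
        excite = (B - x) * inputs               # shunted self-excitation
        inhibit = x * (inputs.sum() - inputs)   # inhibition from other inputs
        x += dt * (-A * x + excite - inhibit)
    return x

def view_to_object_update(w, view, category_act, lr=0.5):
    """Hypothetical view-to-object learning law: the weight change is gated
    by the category activation, so views that win the normalized competition
    decisively (unambiguous views) produce larger updates than views that
    leave the competition nearly tied (ambiguous views)."""
    return w + lr * category_act * (view - w)
```

At equilibrium the shunting field drives each activity toward B*I/(A + sum(I)), so total activity is bounded regardless of how many categories are active, and a stronger (less contested) input retains a proportionally larger activation, which in turn gates a larger weight update. This is how, in the sketch, unambiguous views come to dominate learned object recognition.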
The swapping procedure is predicted to prevent the reset of spatial attention that would otherwise keep the representations of multiple objects from being combined by learning. Li and DiCarlo (2008) presented neurophysiological data from monkeys showing how unsupervised natural experience in a target-swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates these swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data obtained with neurophysiological methods in monkeys, and spatial attention, episodic learning, and memory retrieval data obtained with functional imaging methods in humans.
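The predicted logic of the swap manipulation can be caricatured in a few lines. This toy simulation is not the pARTSCAN model: the boolean reset flags, the one-hot view codes, and the learning rate are hypothetical. It only illustrates the prediction that preventing the attentional reset at an object change merges the views of two different objects into a single learned category, whereas normal viewing keeps them separate.

```python
import numpy as np

def learn_views(view_seq, reset_flags, n_views=4, lr=0.5):
    """Toy association of view codes to object categories.
    reset_flags[i] is True when spatial attention resets before view i,
    opening a new object category; when the reset is prevented (as the
    swap manipulation is predicted to do), the next view is linked to
    the currently active category instead."""
    weights = {}   # category id -> learned view-to-object weights
    current = -1
    for view, reset in zip(view_seq, reset_flags):
        if reset or current < 0:
            current = len(weights)
            weights[current] = np.zeros(n_views)
        v = np.eye(n_views)[view]                      # one-hot view code
        weights[current] += lr * (v - weights[current])  # gated update
    return weights

# Normal viewing: attention resets at the object boundary (before view 2),
# so views 0-1 and views 2-3 are learned by two separate categories.
normal = learn_views([0, 1, 2, 3], [True, False, True, False])

# Swap condition: the reset at the object boundary is prevented, so all
# four views, belonging to two different objects, merge into one category.
swapped = learn_views([0, 1, 2, 3], [True, False, False, False])
```

In the swapped run the single remaining category carries nonzero weights for views of both objects, which is the toy analogue of the altered IT representations reported in the swap experiment.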