Cox Patrick H, Riesenhuber Maximilian
Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007.
J Neurosci. 2015 Oct 21;35(42):14148-59. doi: 10.1523/JNEUROSCI.1211-15.2015.
The ability to recognize objects in clutter is crucial for human vision, yet the underlying neural computations remain poorly understood. Previous single-unit electrophysiology recordings in inferotemporal cortex in monkeys and fMRI studies of object-selective cortex in humans have shown that the responses to pairs of objects can sometimes be well described as a weighted average of the responses to the constituent objects. Yet, from a computational standpoint, it is not clear how the challenge of object recognition in clutter can be solved if downstream areas must disentangle the identity of an unknown number of individual objects from the confounded average neuronal responses. An alternative idea is that recognition is based on a subpopulation of neurons that are robust to clutter, i.e., that do not show response averaging, but rather robust object-selective responses in the presence of clutter. Here we show that simulations using the HMAX model of object recognition in cortex can fit the aforementioned single-unit and fMRI data, showing that the averaging-like responses can be understood as the result of responses of object-selective neurons to suboptimal stimuli. Moreover, the model shows how object recognition can be achieved by a sparse readout of neurons whose selectivity is robust to clutter. Finally, the model provides a novel prediction about human object recognition performance, namely, that target recognition ability should show a U-shaped dependency on the similarity of simultaneously presented clutter objects. This prediction is confirmed experimentally, supporting a simple, unifying model of how the brain performs object recognition in clutter.
The neural mechanisms underlying object recognition in cluttered scenes (i.e., containing more than one object) remain poorly understood. Studies have suggested that neural responses to multiple objects correspond to an average of the responses to the constituent objects. Yet, it is unclear how the identities of an unknown number of objects could be disentangled from a confounded average response. Here, we use a popular computational biological vision model to show that averaging-like responses can result from responses of clutter-tolerant neurons to suboptimal stimuli. The model also provides a novel prediction, that human detection ability should show a U-shaped dependency on target-clutter similarity, which is confirmed experimentally, supporting a simple, unifying account of how the brain performs object recognition in clutter.
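The "weighted average" description mentioned above can be made concrete with a small sketch. The idea is to model the response to an object pair as a linear combination of the responses to each object alone, r_pair ≈ w_a * r_a + w_b * r_b, and estimate the weights by least squares. The function name `fit_pair_weights`, the no-intercept form of the fit, and the synthetic firing rates are assumptions made for illustration; they are not taken from the paper or from the HMAX code.

```python
import numpy as np


def fit_pair_weights(r_a, r_b, r_pair):
    """Least-squares fit of r_pair ~ w_a * r_a + w_b * r_b (no intercept).

    r_a, r_b : responses to each object presented alone
    r_pair   : responses to the two objects presented together
    Returns the array [w_a, w_b].
    """
    design = np.column_stack([r_a, r_b])
    weights, *_ = np.linalg.lstsq(design, r_pair, rcond=None)
    return weights


# Synthetic, illustrative data only (not data from the study).
rng = np.random.default_rng(0)
r_a = rng.uniform(0.0, 50.0, size=200)
r_b = rng.uniform(0.0, 50.0, size=200)

# A hypothetical population that averages its single-object responses exactly.
r_pair = 0.5 * r_a + 0.5 * r_b

w_a, w_b = fit_pair_weights(r_a, r_b, r_pair)
```

For this idealized population the fitted weights come out near 0.5 each. Replacing `r_pair` with `np.maximum(r_a, r_b)` gives one simple way to explore how a clutter-tolerant (max-like) response departs from the averaging description.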