Yan Karklin, Michael S. Lewicki
Computer Science Department & Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Nature. 2009 Jan 1;457(7225):83-6. doi: 10.1038/nature07481. Epub 2008 Nov 19.
A fundamental function of the visual system is to encode the building blocks of natural scenes (edges, textures and shapes) that subserve visual tasks such as object recognition and scene understanding. Essential to this process is the formation of abstract representations that generalize from specific instances of visual input. A common view holds that neurons in the early visual system signal conjunctions of image features, but how these produce invariant representations is poorly understood. Here we propose that to generalize over similar images, higher-level visual neurons encode statistical variations that characterize local image regions. We present a model in which neural activity encodes the probability distribution most consistent with a given image. Trained on natural images, the model generalizes by learning a compact set of dictionary elements for image distributions typically encountered in natural scenes. Model neurons show a diverse range of properties observed in cortical cells. These results provide a new functional explanation for nonlinear effects in complex cells and offer insight into coding strategies in primary visual cortex (V1) and higher visual areas.