Yang WenLu, Zhang LiQing, Ma LiBo
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
Sci China C Life Sci. 2008 Jun;51(6):526-36. doi: 10.1007/s11427-008-0074-0. Epub 2008 May 17.
Perceiving objects and their motion in a visual scene is one of the basic problems the visual system must solve. The 'What' and 'Where' pathways of the higher visual cortex both originate from the simple cells of the primary visual cortex. The former perceives object attributes such as form, color, and texture; the latter perceives 'where', for example the velocity and direction of an object's movement in space. This paper explores brain-like computational architectures for visual information processing. We propose a visual perceptual model and a computational mechanism for training it. The model is a three-layer network. The first layer is the input layer, which receives stimuli from natural environments. The second layer represents the internal neural information. The connections between the first and second layers, called the receptive fields of the neurons, are learned self-adaptively based on the principle of sparse neural representation. To this end, we introduce the Kullback-Leibler divergence as a measure of independence between neural responses and derive the learning algorithm by minimizing the resulting cost function. The proposed algorithm is applied to train the basis functions, namely the receptive fields, which turn out to be localized, oriented, and bandpass. The resulting receptive fields of the second-layer neurons have characteristics resembling those of simple cells in the primary visual cortex. Based on these basis functions, we further construct the third layer, which models the perception of 'what' and 'where' in the higher visual cortex. The proposed model perceives objects and their motions with high accuracy and is strongly robust against additive noise. Computer simulation results show the feasibility of the proposed perceptual model and the high efficiency of the learning algorithm.
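The abstract does not give the cost function itself, so the following is only a minimal sketch of the general idea behind using the Kullback-Leibler divergence as an independence measure. It assumes two neural responses discretized into a joint histogram, and computes the divergence between the joint distribution and the product of its marginals. The function name kl_independence and the toy distributions are hypothetical illustrations, not the authors' algorithm.

```python
import numpy as np


def kl_independence(joint):
    """KL divergence between a joint distribution and the product of its marginals.

    `joint` is a 2-D array of non-negative counts or probabilities for two
    discretized responses (an assumption of this sketch). The result is 0
    when the responses are independent and grows as they become dependent.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                # normalize to probabilities
    p_a = joint.sum(axis=1, keepdims=True)     # marginal of the first response
    p_b = joint.sum(axis=0, keepdims=True)     # marginal of the second response
    product = p_a * p_b                        # distribution under independence
    mask = joint > 0                           # treat 0 * log(0) as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


# Toy example: an independent pair and a fully dependent pair of responses.
independent = np.outer([0.5, 0.5], [0.25, 0.75])
dependent = np.array([[0.5, 0.0], [0.0, 0.5]])

print(kl_independence(independent))
print(kl_independence(dependent))
```

In this toy setting the independent pair gives a divergence of zero and the fully dependent pair gives log 2, so a learning rule that lowers this quantity pushes the responses toward independence.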