Bartlett M S, Sejnowski T J
University of California San Diego, Department of Cognitive Science, Salk Institute, La Jolla 92037, USA.
Network. 1998 Aug;9(3):399-417.
In natural visual experience, different views of an object or face tend to appear in close temporal proximity as an animal manipulates the object or navigates around it, or as a face changes expression or pose. A set of simulations is presented which demonstrate how viewpoint-invariant representations of faces can be developed from visual experience by capturing the temporal relationships among the input patterns. The simulations explored the interaction of temporal smoothing of activity signals with Hebbian learning in both a feedforward layer and a second, recurrent layer of a network. The feedforward connections were trained by competitive Hebbian learning with temporal smoothing of the post-synaptic unit activities. The recurrent layer was a generalization of a Hopfield network with a low-pass temporal filter on all unit activities. The combination of basic Hebbian learning with temporal smoothing of unit activities produced an attractor network learning rule that associated temporally proximal input patterns into basins of attraction. These two mechanisms were demonstrated in a model that took grey-level images of faces as input. Following training on image sequences of faces as they changed pose, multiple views of a given face fell into the same basin of attraction, and the system acquired representations of faces that were approximately viewpoint-invariant.
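The core idea of the attractor learning rule described in the abstract, Hebbian learning applied to temporally smoothed unit activities, can be illustrated with a minimal sketch. This is not the authors' model: the function names (`smoothed_hebbian_weights`, `coupling`), the filter constant `lam`, the four-unit toy patterns, and the choice to keep self-connections are all assumptions made for illustration. The sketch only shows why low-pass filtering the activities before a Hebbian outer-product update couples patterns that occur next to each other in time.

```python
def smoothed_hebbian_weights(sequences, lam):
    """Sum Hebbian outer products of low-pass-filtered activities.

    sequences: list of sequences, each a list of +1/-1 patterns.
    lam: filter constant in [0, 1); lam = 0 gives plain Hebbian learning.
    Simplification (assumption): self-connections are kept.
    """
    n = len(sequences[0][0])
    weights = [[0.0] * n for _ in range(n)]
    for sequence in sequences:
        trace = [0.0] * n  # reset the activity trace between sequences
        for pattern in sequence:
            # Exponential low-pass filter of the unit activities.
            trace = [lam * t + (1.0 - lam) * x for t, x in zip(trace, pattern)]
            # Hebbian update on the smoothed activities.
            for i in range(n):
                for j in range(n):
                    weights[i][j] += trace[i] * trace[j]
    return weights


def coupling(weights, x, y):
    """Bilinear form x^T W y: how strongly the weights link x and y."""
    n = len(x)
    return sum(x[i] * weights[i][j] * y[j] for i in range(n) for j in range(n))


# Hypothetical toy data: two "faces", each seen in two mutually orthogonal "views".
FACE_A = [[1, 1, 1, 1], [1, -1, 1, -1]]
FACE_B = [[1, 1, -1, -1], [1, -1, -1, 1]]

smoothed = smoothed_hebbian_weights([FACE_A, FACE_B], lam=0.5)
plain = smoothed_hebbian_weights([FACE_A, FACE_B], lam=0.0)

print("same face, smoothed:   ", coupling(smoothed, FACE_A[0], FACE_A[1]))
print("other face, smoothed:  ", coupling(smoothed, FACE_A[0], FACE_B[0]))
print("same face, no smoothing:", coupling(plain, FACE_A[0], FACE_A[1]))
```

With smoothing switched off, orthogonal views of the same face remain uncoupled, whereas with smoothing the two views that follow each other in a sequence acquire a positive coupling while views of different faces do not. A positive coupling of this kind is what would let multiple views share a basin of attraction in a full recurrent network, which this sketch does not simulate.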