Erwin E, Obermayer K, Schulten K
Beckman Institute, University of Illinois, Urbana-Champaign 61801.
Biol Cybern. 1992;67(1):47-55. doi: 10.1007/BF00201801.
We investigate the convergence properties of the self-organizing feature map algorithm for a simple but very instructive case: the formation of a topographic representation of the unit interval [0, 1] by a linear chain of neurons. We extend the convergence proofs of Kohonen and of Cottrell and Fort to hold whenever the neighborhood function, which scales the change in the weight values at each neuron, is a monotonically decreasing function of distance from the winner neuron. We prove that the learning dynamics cannot be described as gradient descent on a single energy function, but can be described using a set of potential functions, one for each neuron, each of which is independently minimized by stochastic gradient descent. We derive the correct potential functions for the one- and multi-dimensional cases, and show that the energy functions given by Tolat (1990) are an approximation that is no longer valid for highly disordered maps or steep neighborhood functions.
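The setting described in the abstract can be illustrated with a minimal sketch of the standard Kohonen update on a linear chain of neurons mapping the unit interval. The Gaussian neighborhood, the fixed learning rate `epsilon`, the width `sigma`, and all function names are assumptions chosen for illustration; the abstract only requires the neighborhood function to decrease monotonically with distance from the winner.

```python
import math
import random


def gaussian_neighborhood(distance, sigma):
    """Neighborhood weight, monotonically decreasing in |distance|.

    The Gaussian shape is an illustrative assumption, not taken from the abstract.
    """
    return math.exp(-(distance * distance) / (2.0 * sigma * sigma))


def train_chain(weights, n_steps=5000, epsilon=0.1, sigma=2.0, seed=0):
    """Run the Kohonen update on a linear chain and return the new weights.

    weights : initial scalar weight of each neuron along the chain
    epsilon : fixed learning rate (assumed constant for simplicity)
    sigma   : width of the assumed Gaussian neighborhood
    """
    rng = random.Random(seed)
    w = list(weights)
    n = len(w)
    for _ in range(n_steps):
        v = rng.random()  # input drawn uniformly from [0, 1)
        # Winner: the neuron whose weight is closest to the input.
        s = min(range(n), key=lambda r: abs(w[r] - v))
        # Every neuron moves toward the input, scaled by its chain
        # distance from the winner.
        for r in range(n):
            h = gaussian_neighborhood(r - s, sigma)
            w[r] += epsilon * h * (v - w[r])
    return w


def is_ordered(w):
    """True if the weights are strictly increasing or strictly decreasing."""
    pairs = list(zip(w, w[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


if __name__ == "__main__":
    init_rng = random.Random(1)
    initial = [init_rng.random() for _ in range(10)]
    final = train_chain(initial)
    print("ordered before:", is_ordered(initial))
    print("ordered after: ", is_ordered(final))
```

Starting from an already ordered chain, the update keeps the chain ordered, which is the absorbing property that the convergence proofs rely on. Starting from random weights, the chain usually becomes ordered after enough steps, but how long this takes depends on `sigma`, `epsilon`, and the random seed.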