ISL, Department of Electrical Engineering, Stanford University, CA, USA.
Neural Netw. 2013 Jan;37:182-8. doi: 10.1016/j.neunet.2012.09.020. Epub 2012 Oct 15.
A new learning algorithm for multilayer neural networks, which we have named No-Propagation (No-Prop), is introduced here. With this algorithm, the weights of the hidden-layer neurons are set to random values and fixed. Only the weights of the output-layer neurons are trained, by steepest descent to minimize mean square error, using the LMS algorithm of Widrow and Hoff. The purpose of introducing nonlinearity with the hidden layers is examined from the point of view of Least Mean Square Error Capacity (LMS Capacity), defined as the maximum number of distinct patterns that can be trained into the network with zero error. This capacity is shown to be equal to the number of weights of each of the output-layer neurons. The No-Prop and Back-Prop algorithms are compared. Our experience with No-Prop is limited, but from the several examples presented here, the training and generalization performance of the two algorithms appears to be essentially the same when the number of training patterns is less than or equal to the LMS Capacity. When the number of training patterns exceeds this capacity, Back-Prop is generally the better performer. Equivalent performance can, however, be obtained with No-Prop by increasing the network capacity, that is, by increasing the number of neurons in the hidden layer that drives the output layer. The No-Prop algorithm is much simpler and easier to implement than Back-Prop, and it converges much faster. It is too early to say definitively where to use one or the other of these algorithms. This is still a work in progress.
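As a concrete illustration of the procedure the abstract describes (this is a minimal sketch, not the authors' code), the following trains a two-layer network in the No-Prop style: the hidden-layer weights are drawn at random and frozen, and only the output-layer weights are updated with the per-sample Widrow-Hoff LMS rule. The hidden-layer size, learning rate, tanh nonlinearity, and the XOR task are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_no_prop(X, T, n_hidden=50, lr=0.01, epochs=500):
    """No-Prop-style training sketch.

    X: (n_samples, n_in) inputs; T: (n_samples, n_out) targets.
    Hidden weights are random and never updated; only the output
    layer is trained, using the Widrow-Hoff LMS update.
    """
    n_in, n_out = X.shape[1], T.shape[1]

    # Fixed random hidden layer (set once, never trained).
    W_h = rng.standard_normal((n_in, n_hidden))
    b_h = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W_h + b_h)  # nonlinear hidden activations

    # Trainable linear output layer.
    W_o = np.zeros((n_hidden, n_out))
    for _ in range(epochs):
        for h, t in zip(H, T):
            e = t - h @ W_o              # instantaneous error
            W_o += lr * np.outer(h, e)   # LMS (steepest-descent) step

    def predict(X_new):
        return np.tanh(X_new @ W_h + b_h) @ W_o

    return predict

# Example: XOR, a task that needs the hidden-layer nonlinearity.
# Four patterns is well under the LMS Capacity of 50 output weights,
# so near-zero training error is attainable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
model = train_no_prop(X, T)
pred = model(X)
```

Note that, consistent with the abstract, no error is propagated back through the hidden layer: each LMS step touches only `W_o`, which is why the method converges quickly and is simple to implement.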