Yang HH, Amari S
Oregon Graduate Institute, Computer Science Dept, Box 91000, Portland OR 97291, USA.
Neural Comput. 1998 Nov 15;10(8):2137-57. doi: 10.1162/089976698300017007.
The natural gradient descent method is applied to train an n-m-1 multilayer perceptron. Based on an efficient scheme to represent the Fisher information matrix for an n-m-1 stochastic multilayer perceptron, a new algorithm is proposed to calculate the natural gradient without inverting the Fisher information matrix explicitly. When the input dimension n is much larger than the number of hidden neurons m, the time complexity of computing the natural gradient is O(n).
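For orientation, the following is a minimal sketch of the explicit baseline that the abstract contrasts against: a natural gradient step for an n-m-1 perceptron that builds an empirical Fisher information matrix and solves a linear system with it. It is not the algorithm proposed in the paper. The tanh hidden units, the hidden biases, the additive Gaussian output noise with variance `sigma2`, the `damping` term, and all function names are assumptions made for illustration.

```python
import numpy as np


def unpack(theta, n, m):
    """Split a flat parameter vector into hidden weights W (m x n),
    hidden biases b (m,) and output weights a (m,)."""
    W = theta[: n * m].reshape(m, n)
    b = theta[n * m : n * m + m]
    a = theta[n * m + m :]
    return W, b, a


def output_and_jacobian(theta, X, n, m):
    """Return network outputs y (N,) and the per-sample Jacobian J (N, P)
    of the output with respect to theta, where P = m * (n + 2)."""
    W, b, a = unpack(theta, n, m)
    H = np.tanh(X @ W.T + b)               # hidden activations, (N, m)
    y = H @ a                              # scalar output per sample
    dpre = (1.0 - H**2) * a                # dy / d(pre-activation), (N, m)
    dW = dpre[:, :, None] * X[:, None, :]  # dy / dW, (N, m, n)
    J = np.concatenate([dW.reshape(len(X), -1), dpre, H], axis=1)
    return y, J


def natural_gradient_step(theta, X, t, n, m, lr=0.1, sigma2=1.0, damping=1e-3):
    """One explicit natural gradient step (assumed Gaussian noise model).

    The Fisher matrix is estimated from the batch as J^T J / (N * sigma2)
    and the step solves a P x P system, which costs O(P^3) and is exactly
    the expense an inversion-free scheme is meant to avoid.
    """
    N = len(X)
    y, J = output_and_jacobian(theta, X, n, m)
    grad = J.T @ (y - t) / (N * sigma2)    # gradient of mean negative log-likelihood
    fisher = J.T @ J / (N * sigma2)        # empirical Fisher information matrix
    # Damping keeps the system well conditioned; it is an added assumption.
    nat_grad = np.linalg.solve(fisher + damping * np.eye(len(theta)), grad)
    return theta - lr * nat_grad, nat_grad, grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, N = 8, 3, 200
    X = rng.normal(size=(N, n))
    teacher = rng.normal(size=m * (n + 2))
    t = output_and_jacobian(teacher, X, n, m)[0] + 0.1 * rng.normal(size=N)
    theta = 0.1 * rng.normal(size=m * (n + 2))
    for _ in range(50):
        theta, _, _ = natural_gradient_step(theta, X, t, n, m)
    y, _ = output_and_jacobian(theta, X, n, m)
    print("mean squared error:", np.mean((y - t) ** 2))
```

With P = m(n + 2) parameters, the dense solve in this sketch scales cubically in P, which is why a representation of the Fisher matrix that avoids explicit inversion matters when n is much larger than m.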