Mirikitani Derrick T, Nikolaev Nikolay
Department of Computing, Goldsmiths College, University of London, London, UK.
IEEE Trans Neural Netw. 2010 Feb;21(2):262-74. doi: 10.1109/TNN.2009.2036174. Epub 2009 Dec 28.
This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg-Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.
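To make the sequential second-order update described above concrete, here is a minimal illustrative sketch in Python/NumPy. It is not the authors' algorithm: the paper derives a recursive Bayesian Levenberg-Marquardt recursion with principled hyperparameter adaptation, whereas this sketch uses an EKF-like gain, folds LM-style damping into the innovation variance, re-estimates the noise hyperparameter with a crude running average, and replaces RTRL-style sensitivities with one-step finite differences. The tiny Elman network and all symbols (`P`, `beta`, `lam`, `var`) are assumptions introduced for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny Elman RNN with scalar input/output; sizes are illustrative.
n_in, n_hid = 1, 4
n_w = n_hid * n_in + n_hid * n_hid + n_hid + n_hid + 1  # total weight count

def unpack(w):
    """Split the flat weight vector into the RNN's parameter matrices."""
    i = 0
    W_xh = w[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    W_hh = w[i:i + n_hid * n_hid].reshape(n_hid, n_hid); i += n_hid * n_hid
    b_h = w[i:i + n_hid]; i += n_hid
    W_hy = w[i:i + n_hid]; i += n_hid
    b_y = w[i]
    return W_xh, W_hh, b_h, W_hy, b_y

def step(w, h, x):
    """One recurrent step; returns the new hidden state and scalar output."""
    W_xh, W_hh, b_h, W_hy, b_y = unpack(w)
    h_new = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h_new, W_hy @ h_new + b_y

def output_jacobian(w, h, x, eps=1e-6):
    """One-step finite-difference Jacobian dy/dw (a stand-in for the
    recursive sensitivity propagation a full implementation would use)."""
    _, y0 = step(w, h, x)
    J = np.empty(n_w)
    for k in range(n_w):
        wp = w.copy(); wp[k] += eps
        J[k] = (step(wp, h, x)[1] - y0) / eps
    return J, y0

w = rng.normal(scale=0.1, size=n_w)
P = np.eye(n_w) / 1e-2   # posterior covariance (inverse-Hessian estimate)
beta = 1.0               # inverse noise-variance hyperparameter
lam = 1e-3               # LM-style damping, folded into the innovation variance
var = 1.0                # running estimate of the residual variance

series = np.sin(0.3 * np.arange(300))  # toy time series
h = np.zeros(n_hid)
for t in range(len(series) - 1):
    x, target = series[t:t + 1], series[t + 1]
    J, y = output_jacobian(w, h, x)
    e = target - y                      # one-step-ahead prediction error
    s = J @ P @ J + 1.0 / beta + lam    # innovation variance plus damping
    K = (P @ J) / s                     # Kalman-like gain vector
    w = w + K * e                       # sequential second-order weight update
    P = P - np.outer(K, J @ P)          # covariance downdate (matrix-inversion lemma)
    P = 0.5 * (P + P.T)                 # keep the covariance symmetric for stability
    var = 0.99 * var + 0.01 * e * e     # crude online noise re-estimation
    beta = 1.0 / max(var, 1e-6)
    h, _ = step(w, h, x)
```

The rank-one covariance downdate via the matrix-inversion lemma keeps the per-sample cost at O(n_w^2), which is what makes this kind of sequential second-order training tractable compared with refitting a full Hessian at every time step.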