On the weight convergence of Elman networks.
Author information
Song Qing
Affiliation
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore.
Publication information
IEEE Trans Neural Netw. 2010 Mar;21(3):463-80. doi: 10.1109/TNN.2009.2039226. Epub 2010 Feb 2.
An Elman network (EN) can be viewed as a feedforward (FF) neural network with an additional set of inputs from the context layer (feedback from the hidden layer). Therefore, instead of the offline backpropagation-through-time (BPTT) algorithm, a standard online (real-time) backpropagation (BP) algorithm, usually called Elman BP (EBP), can be applied for EN training for discrete-time sequence predictions. However, the standard BP training algorithm is not the most suitable for ENs. A low learning rate can improve the training of ENs but can also result in very slow convergence speeds and poor generalization performance, whereas a high learning rate can lead to unstable training in terms of weight divergence. Therefore, an optimal or suboptimal tradeoff between training speed and weight convergence with good generalization capability is desired for ENs. This paper develops a robust extended EBP (eEBP) training algorithm for ENs with a new adaptive dead zone scheme based on eEBP training concepts. The adaptive learning rate and adaptive dead zone optimize the training of ENs for each individual output and improve the generalization performance of the eEBP training. In particular, for the proposed eEBP training algorithm, convergence of the ENs' weights with the adaptive dead zone estimates is proven in the sense of Lyapunov functions. Computer simulations are carried out to demonstrate the improved performance of eEBP for discrete-time sequence predictions.
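The abstract's central observation is that an Elman network can be trained with a standard online BP step by treating the context layer (the previous hidden state) as an extra fixed input, avoiding BPTT. The sketch below illustrates that idea together with a dead-zone gate that skips updates on small errors. All network sizes, learning rates, and the fixed dead-zone threshold are illustrative assumptions, not the paper's adaptive eEBP formulas.

```python
import numpy as np

# Minimal sketch of an Elman network trained with online Elman BP (EBP):
# the context layer holds the previous hidden state and is treated as an
# extra, constant input, so no backpropagation through time is needed.
# The learning rate and dead-zone threshold below are fixed placeholders;
# the paper's eEBP adapts both per output.

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 1, 8, 1
W_in = rng.normal(0, 0.3, (n_hid, n_in))    # input -> hidden weights
W_ctx = rng.normal(0, 0.3, (n_hid, n_hid))  # context -> hidden weights
W_out = rng.normal(0, 0.3, (n_out, n_hid))  # hidden -> output weights

def forward(x, ctx):
    h = np.tanh(W_in @ x + W_ctx @ ctx)  # hidden activation
    y = W_out @ h                        # linear output
    return y, h

def train_step(x, target, ctx, lr=0.05, dead_zone=1e-3):
    """One online EBP step; the weight update is skipped when the error
    falls inside the dead zone (a fixed stand-in for the adaptive scheme)."""
    global W_in, W_ctx, W_out
    y, h = forward(x, ctx)
    e = target - y
    if np.abs(e).max() > dead_zone:  # dead-zone gate
        # Gradients with the context treated as a constant input (EBP, not BPTT)
        delta_out = e
        delta_hid = (W_out.T @ delta_out) * (1.0 - h**2)
        W_out += lr * np.outer(delta_out, h)
        W_ctx += lr * np.outer(delta_hid, ctx)
        W_in += lr * np.outer(delta_hid, x)
    return e, h  # the fresh hidden state becomes the next context

# Train on a one-step-ahead sine prediction task (a toy discrete-time sequence)
seq = np.sin(0.3 * np.arange(400))
for epoch in range(30):
    ctx = np.zeros(n_hid)
    sse = 0.0
    for t in range(len(seq) - 1):
        e, ctx = train_step(seq[t:t+1], seq[t+1:t+2], ctx)
        sse += float(e @ e)
print(f"final epoch SSE: {sse:.4f}")
```

The trade-off the abstract describes is visible here: raising `lr` speeds convergence but can destabilize the weights, while lowering it slows training, which is what motivates the paper's adaptive learning rate and adaptive dead zone.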