Song Qing, Wu Yilei, Soh Yeng Chai
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore.
IEEE Trans Neural Netw. 2008 Nov;19(11):1841-53. doi: 10.1109/TNN.2008.2001923.
For a recurrent neural network (RNN), the transient response is a critical issue, especially in real-time signal-processing applications. Conventional RNN training algorithms such as backpropagation through time (BPTT) and real-time recurrent learning (RTRL) do not address this problem adequately because they suffer from slow convergence. Although a larger learning rate can speed up training, it can also destabilize training through weight divergence, so an optimal tradeoff between training speed and weight convergence is desired. In this paper, a robust adaptive gradient-descent (RAGD) training algorithm for RNNs is developed based on a novel hybrid training concept: the algorithm switches between standard real-time online backpropagation (BP) and RTRL according to derived convergence and stability conditions. The weight convergence and L2-stability of the algorithm are established via the conic sector theorem. The optimized adaptive learning rate maximizes the training speed of the RNN at each weight update without violating the stability and convergence criteria. Computer simulations demonstrate the applicability of the theoretical results.
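The abstract only outlines the hybrid BP/RTRL idea. The minimal NumPy sketch below illustrates it under simplifying assumptions that are not from the paper: a toy single-layer RNN with a fixed linear readout, a hypothetical gradient-norm switching test, and a normalized step size standing in for the learning-rate conditions the paper derives from the conic sector theorem.

```python
# Minimal sketch (not the paper's implementation) of hybrid BP/RTRL training.
# The switching rule and step size below are illustrative placeholders; the
# paper's actual conditions come from its conic-sector-theorem analysis.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_h = 2, 4
n_z = n_h + n_in + 1                        # hidden + input + bias
W = rng.normal(scale=0.1, size=(n_h, n_z))  # trainable recurrent weights
v = rng.normal(scale=0.1, size=n_h)         # fixed readout (for simplicity)
h = np.zeros(n_h)                           # hidden state
S = np.zeros((n_h, n_h, n_z))               # RTRL sensitivities dh_i/dW_kl

ALPHA, C, KAPPA = 0.5, 1e-3, 10.0           # hypothetical tuning constants

def step(x, d):
    """One online weight update on input x with scalar target d."""
    global W, h, S
    z = np.concatenate([h, x, [1.0]])       # augmented input vector
    h_new = np.tanh(W @ z)
    y = v @ h_new
    e = d - y                               # output error

    # Truncated online BP gradient: previous hidden state treated as constant.
    grad_bp = -e * np.outer((1.0 - h_new**2) * v, z)

    # RTRL gradient: recursive sensitivity update
    #   S[i,k,l] <- (1 - h_new_i^2) * (delta_ik * z_l + sum_j W_ij * S[j,k,l])
    prop = np.tensordot(W[:, :n_h], S, axes=(1, 0))
    direct = np.zeros_like(S)
    direct[np.arange(n_h), np.arange(n_h), :] = z
    S = (1.0 - h_new**2)[:, None, None] * (direct + prop)
    grad_rtrl = -e * np.tensordot(v, S, axes=(0, 0))

    # Hypothetical switching test standing in for the derived stability
    # conditions: fall back to the truncated BP gradient when the RTRL
    # gradient grows too large relative to it.
    if np.linalg.norm(grad_rtrl) <= KAPPA * np.linalg.norm(grad_bp):
        g = grad_rtrl
    else:
        g = grad_bp

    # Normalized adaptive step, a common L2-stable choice used here as a
    # stand-in for the paper's optimized learning rate.
    eta = ALPHA / (C + np.sum(g * g))
    W -= eta * g
    h = h_new
    return e

# Smoke test on a toy regression target (illustrative only).
for t in range(500):
    x = rng.normal(size=n_in)
    e = step(x, d=x.sum())
    if t % 100 == 0:
        print(f"t={t:4d}  |e|={abs(e):.4f}")
```

The normalized step eta = ALPHA / (C + ||g||^2) mirrors the normalized-LMS style of bounding each update by the gradient energy; the paper instead selects the largest learning rate that satisfies its derived convergence and stability criteria at each update.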