Zhenzhen Liu, Itamar Elhanany
Electrical Engineering and Computer Science, The University of Tennessee, Knoxville, TN 37996 USA.
IEEE Trans Neural Netw. 2008 Sep;19(9):1652-8. doi: 10.1109/TNN.2008.2000838.
This brief presents an efficient and scalable online learning algorithm for recurrent neural networks (RNNs). The approach is based on the real-time recurrent learning (RTRL) algorithm, whereby the sensitivity set of each neuron is reduced to the weights associated with either its input or output links. This reduces the storage and computational complexity to O(N^2). Stochastic meta-descent (SMD), an adaptive step size scheme for stochastic gradient-descent problems, is employed as a means of incorporating curvature information in order to substantially accelerate the learning process. We also introduce a clustered version of our algorithm to further improve its scalability attributes. Despite the dramatic reduction in resource requirements, it is shown through simulation results that the approach outperforms regular RTRL by almost an order of magnitude. Moreover, the scheme lends itself to parallel hardware realization by virtue of the localized property that is inherent to the learning framework.
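To illustrate the kind of step-size adaptation the abstract refers to, the following Python sketch applies a generic SMD-style multiplicative per-parameter step-size update to a toy quadratic objective. It is a minimal illustration of the SMD mechanism only, not the paper's RNN-specific update; the hyperparameter names (eta0, mu, lam) and the toy problem are assumptions for illustration.

import numpy as np

def smd_minimize(A, b, w, eta0=0.01, mu=0.05, lam=0.99, steps=500):
    """Minimize 0.5*w^T A w - b^T w with SMD-adapted per-weight step sizes."""
    eta = np.full_like(w, eta0)          # one step size per parameter
    v = np.zeros_like(w)                 # trace of d(w)/d(log eta)
    for _ in range(steps):
        g = A @ w - b                    # gradient of the quadratic
        Hv = A @ v                       # exact Hessian-vector product for this toy case
        # Multiplicative step-size update driven by gradient/trace agreement.
        eta *= np.maximum(0.5, 1.0 - mu * g * v)
        w = w - eta * g                  # ordinary gradient step, scaled per weight
        v = lam * v - eta * (g + lam * Hv)
    return w

if __name__ == "__main__":
    A = np.diag([1.0, 10.0, 100.0])      # ill-conditioned toy problem
    b = np.array([1.0, -2.0, 3.0])
    w = smd_minimize(A, b, np.zeros(3))
    print("SMD solution:  ", w)
    print("exact solution:", np.linalg.solve(A, b))

In this sketch, each step size grows multiplicatively when successive gradient directions agree and shrinks when they conflict, which is how curvature information enters without ever forming a full Hessian; the paper combines this mechanism with the reduced RTRL sensitivity sets described above.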