Efficient Online Learning Algorithms Based on LSTM Neural Networks.

Author information

Ergen Tolga, Kozat Suleyman Serdar

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Aug;29(8):3772-3783. doi: 10.1109/TNNLS.2017.2741598. Epub 2017 Sep 13.

Abstract

We investigate online nonlinear regression and introduce novel regression structures based on long short-term memory (LSTM) networks. For these structures, we also provide highly efficient and effective online training methods. To train these novel LSTM-based structures, we put the underlying architecture in state-space form and introduce highly efficient and effective particle filtering (PF)-based updates. We also provide stochastic gradient descent and extended Kalman filter-based updates. Our PF-based training method guarantees convergence to the optimal parameter estimate in the mean square error sense, provided that we have a sufficient number of particles and satisfy certain technical conditions. More importantly, we achieve this performance with a computational complexity on the order of first-order gradient-based methods by controlling the number of particles. Since our approach is generic, we also introduce a gated recurrent unit (GRU)-based approach by directly replacing the LSTM architecture with the GRU architecture, and we demonstrate the superiority of our LSTM-based approach in the sequential prediction task on different real-life data sets. In addition, the experimental results illustrate significant performance improvements achieved by the introduced algorithms with respect to conventional methods over several different benchmark real-life data sets.

