Efficient Online Learning Algorithms Based on LSTM Neural Networks.

Author information

Ergen Tolga, Kozat Suleyman Serdar

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Aug;29(8):3772-3783. doi: 10.1109/TNNLS.2017.2741598. Epub 2017 Sep 13.

DOI:10.1109/TNNLS.2017.2741598

PMID:28920911

Abstract

We investigate online nonlinear regression and introduce novel regression structures based on the long short term memory (LSTM) networks. For the introduced structures, we also provide highly efficient and effective online training methods. To train these novel LSTM-based structures, we put the underlying architecture in a state space form and introduce highly efficient and effective particle filtering (PF)-based updates. We also provide stochastic gradient descent and extended Kalman filter-based updates. Our PF-based training method guarantees convergence to the optimal parameter estimation in the mean square error sense provided that we have a sufficient number of particles and satisfy certain technical conditions. More importantly, we achieve this performance with a computational complexity in the order of the first-order gradient-based methods by controlling the number of particles. Since our approach is generic, we also introduce a gated recurrent unit (GRU)-based approach by directly replacing the LSTM architecture with the GRU architecture, where we demonstrate the superiority of our LSTM-based approach in the sequential prediction task via different real life data sets. In addition, the experimental results illustrate significant performance improvements achieved by the introduced algorithms with respect to the conventional methods over several different benchmark real life data sets.
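The particle filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a random-walk model for the LSTM parameters, Gaussian observation noise, and multinomial resampling triggered by a low effective sample size. The names `lstm_step`, `pf_online_regression`, `q_std`, and `r_std` are hypothetical and chosen only for this illustration. Each particle carries its own parameter vector and LSTM state, which together play the role of the hidden state in the state space form.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(theta, x, h, c, m):
    """Advance one LSTM step for every particle at once.

    theta: (N, P) parameters per particle, x: (p,) input,
    h, c: (N, m) LSTM output and cell state per particle.
    """
    N, p = theta.shape[0], x.shape[0]
    k = p + m + 1                                   # input, previous output, bias
    W = theta[:, :4 * m * k].reshape(N, 4 * m, k)   # stacked gate weights
    w_out = theta[:, 4 * m * k:]                    # (N, m) linear readout
    z = np.concatenate([np.broadcast_to(x, (N, p)), h, np.ones((N, 1))], axis=1)
    a = np.einsum("nij,nj->ni", W, z)               # (N, 4m) pre-activations
    i, f, o = sigmoid(a[:, :m]), sigmoid(a[:, m:2 * m]), sigmoid(a[:, 2 * m:3 * m])
    g = np.tanh(a[:, 3 * m:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return np.sum(w_out * h, axis=1), h, c          # scalar prediction per particle

def pf_online_regression(X, d, m=2, n_particles=200, q_std=0.01, r_std=0.1, seed=0):
    """Online regression: predict d[t] from X[t], then update the particles."""
    rng = np.random.default_rng(seed)
    T, p = X.shape
    P = 4 * m * (p + m + 1) + m
    theta = rng.normal(0.0, 0.5, size=(n_particles, P))
    h = np.zeros((n_particles, m))
    c = np.zeros((n_particles, m))
    w = np.full(n_particles, 1.0 / n_particles)
    preds = np.empty(T)
    for t in range(T):
        # Assumed state equation: parameters follow a random walk.
        theta = theta + rng.normal(0.0, q_std, size=theta.shape)
        y, h, c = lstm_step(theta, X[t], h, c, m)
        preds[t] = np.dot(w, y)                     # prediction before d[t] is seen
        # Assumed observation model: d[t] = y + Gaussian noise with std r_std.
        logw = np.log(w + 1e-300) - 0.5 * ((d[t] - y) / r_std) ** 2
        logw -= logw.max()                          # stabilize before exponentiating
        w = np.exp(logw)
        w /= w.sum()
        # Resample when the effective sample size drops below half the particles.
        if 1.0 / np.sum(w ** 2) < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=w)
            theta, h, c = theta[idx], h[idx], c[idx]
            w = np.full(n_particles, 1.0 / n_particles)
    return preds
```

The cost per time step grows linearly with `n_particles`, which is the knob the abstract refers to when it mentions controlling the number of particles. A call such as `pf_online_regression(X, d, n_particles=50)` returns one prediction per time step, each made before the corresponding target is used for the update.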

Similar articles

Online Training of LSTM Networks in Distributed Systems for Variable Length Data Sequences.

IEEE Trans Neural Netw Learn Syst. 2018 Oct;29(10):5159-5165. doi: 10.1109/TNNLS.2017.2770179. Epub 2017 Dec 7.

Energy-Efficient LSTM Networks for Online Learning.

IEEE Trans Neural Netw Learn Syst. 2020 Aug;31(8):3114-3126. doi: 10.1109/TNNLS.2019.2935796. Epub 2019 Sep 13.

Unsupervised Anomaly Detection With LSTM Neural Networks.

IEEE Trans Neural Netw Learn Syst. 2020 Aug;31(8):3127-3141. doi: 10.1109/TNNLS.2019.2935975. Epub 2019 Sep 13.

Nonuniformly Sampled Data Processing Using LSTM Networks.

IEEE Trans Neural Netw Learn Syst. 2019 May;30(5):1452-1461. doi: 10.1109/TNNLS.2018.2869822. Epub 2018 Oct 1.

A generalized LSTM-like training algorithm for second-order recurrent neural networks.

Neural Netw. 2012 Jan;25(1):70-83. doi: 10.1016/j.neunet.2011.07.003. Epub 2011 Jul 18.

Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets.

Neural Netw. 2003 Mar;16(2):241-50. doi: 10.1016/S0893-6080(02)00219-8.

Long short-term memory.

Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.

Attention mechanism enhanced LSTM with residual architecture and its application for protein-protein interaction residue pairs prediction.

BMC Bioinformatics. 2019 Nov 27;20(1):609. doi: 10.1186/s12859-019-3199-1.

Forecasting stock prices with long-short term memory neural network based on attention mechanism.

PLoS One. 2020 Jan 3;15(1):e0227222. doi: 10.1371/journal.pone.0227222. eCollection 2020.

Cited by

LSTM-Based Virtual Load Sensor for Heavy-Duty Vehicles.

Sensors (Basel). 2023 Dec 30;24(1):226. doi: 10.3390/s24010226.

ξ: An AI-Based Data Analytics Scheme for COVID-19 Prediction and Economy Boosting.

IEEE Internet Things J. 2020 Dec 25;8(21):15977-15989. doi: 10.1109/JIOT.2020.3047539. eCollection 2021 Nov 1.

Design of Fault Prediction System for Electromechanical Sensor Equipment Based on Deep Learning.

Comput Intell Neurosci. 2022 Mar 17;2022:3057167. doi: 10.1155/2022/3057167. eCollection 2022.

Based on improved deep convolutional neural network model pneumonia image classification.

PLoS One. 2021 Nov 4;16(11):e0258804. doi: 10.1371/journal.pone.0258804. eCollection 2021.

Brain wave classification using long short-term memory network based OPTICAL predictor.

Sci Rep. 2019 Jun 24;9(1):9153. doi: 10.1038/s41598-019-45605-1.

A Robust Terrain Aided Navigation Using the Rao-Blackwellized Particle Filter Trained by Long Short-Term Memory Networks.

Sensors (Basel). 2018 Aug 31;18(9):2886. doi: 10.3390/s18092886.
