Achieving Online Regression Performance of LSTMs With Simple RNNs.

Authors

Vural N Mert, Ilhan Fatih, Yilmaz Selim F, Ergut Salih, Kozat Suleyman Serdar

Publication

IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7632-7643. doi: 10.1109/TNNLS.2021.3086029. Epub 2022 Nov 30.

Abstract

Recurrent neural networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. As an RNN model, long short-term memory networks (LSTMs) are commonly preferred in practice, as these networks are capable of learning long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, LSTMs require considerably longer training time than simple RNNs (SRNNs). In this article, we achieve the online regression performance of LSTMs with SRNNs efficiently. To this end, we introduce a first-order training algorithm with linear time complexity in the number of parameters. We show that when SRNNs are trained with our algorithm, they provide regression performance very similar to that of LSTMs in two to three times shorter training time. We support our experimental results with a strong theoretical analysis, providing regret bounds on the convergence rate of our algorithm. Through an extensive set of experiments, we verify our theoretical work and demonstrate significant performance improvements of our algorithm with respect to LSTMs and other state-of-the-art learning models.
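The abstract describes training SRNNs for online regression with a first-order algorithm whose per-step cost is linear in the number of parameters. As a rough illustration of that setting only, the sketch below implements a generic online SRNN regressor with a plain SGD-style update truncated to the current step; the network sizes, learning rate, and the `step` helper are hypothetical choices for this example and do not reproduce the paper's specific algorithm or its regret guarantees.

```python
# Minimal sketch of online regression with a simple RNN (SRNN) trained by a
# first-order (SGD-style) update. This only illustrates the general setting
# described in the abstract; it is NOT the paper's algorithm, and the one-step
# gradient truncation below is an assumption made for brevity.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden = 4, 16           # input and hidden dimensions (arbitrary)
lr = 0.01                        # learning rate (hypothetical value)

# SRNN parameters: h_t = tanh(W_h h_{t-1} + W_x x_t), y_hat_t = w^T h_t
W_h = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_x = rng.normal(scale=0.1, size=(n_hidden, n_in))
w = rng.normal(scale=0.1, size=n_hidden)

h_prev = np.zeros(n_hidden)

def step(x_t, y_t):
    """Process one (x_t, y_t) pair online: predict, observe the loss, update."""
    global W_h, W_x, w, h_prev
    h = np.tanh(W_h @ h_prev + W_x @ x_t)
    y_hat = w @ h                          # prediction before y_t is revealed
    err = y_hat - y_t                      # squared-error loss: 0.5 * err**2

    # First-order gradients, truncated to the current step (assumption:
    # h_prev is treated as a constant, so the cost per update stays
    # linear in the number of parameters).
    dh = err * w * (1.0 - h**2)            # dL/d(pre-activation) through tanh
    w -= lr * err * h
    W_h -= lr * np.outer(dh, h_prev)
    W_x -= lr * np.outer(dh, x_t)

    h_prev = h
    return y_hat

# Toy online stream: regress a noisy moving sum of the inputs.
for t in range(1000):
    x_t = rng.normal(size=n_in)
    y_t = 0.5 * x_t.sum() + 0.1 * rng.normal()
    step(x_t, y_t)
```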
