A generalized LSTM-like training algorithm for second-order recurrent neural networks.

Author information

Department of Computer Science, University of Maryland, College Park, MD 20742, USA.

Publication information

Neural Netw. 2012 Jan;25(1):70-83. doi: 10.1016/j.neunet.2011.07.003. Epub 2011 Jul 18.

Abstract

The long short-term memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM's original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the generalized long short-term memory (LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this is in contrast to the original LSTM training algorithm, where each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g takes advantage of an additional source of back-propagated error which can enable better performance than the original algorithm. Enabled by the broad architectural applicability of LSTM-g, we demonstrate that training recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential to both improve the performance and broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks.
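
For reference, the peephole connections mentioned in the abstract let each gate read the memory cell's internal state directly; it is the error back-propagated through these connections that LSTM-g exploits and that the original algorithm's gradient truncation discards. Below is a minimal NumPy sketch of the standard peephole LSTM forward pass (the Gers-Schmidhuber formulation), not the LSTM-g update rules from the paper; the function name `peephole_lstm_step` and the parameter-dictionary layout are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def peephole_lstm_step(x, h_prev, c_prev, p):
    """One forward step of a peephole LSTM cell (illustrative sketch).

    The peephole weights p['pi'], p['pf'], p['po'] are diagonal
    (element-wise) connections from the cell state to the gates.
    """
    z = np.concatenate([x, h_prev])                        # input + recurrent signal
    i = sigmoid(p['Wi'] @ z + p['pi'] * c_prev + p['bi'])  # input gate peeks at old state
    f = sigmoid(p['Wf'] @ z + p['pf'] * c_prev + p['bf'])  # forget gate peeks at old state
    g = np.tanh(p['Wg'] @ z + p['bg'])                     # candidate cell input
    c = f * c_prev + i * g                                 # gated state update
    o = sigmoid(p['Wo'] @ z + p['po'] * c + p['bo'])       # output gate peeks at new state
    h = o * np.tanh(c)                                     # unit output
    return h, c

# Hypothetical usage: run a random 5-step sequence through one cell.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
p = {k: rng.normal(0, 0.1, (n_hid, n_in + n_hid)) for k in ('Wi', 'Wf', 'Wg', 'Wo')}
p.update({k: rng.normal(0, 0.1, n_hid)
          for k in ('pi', 'pf', 'po', 'bi', 'bf', 'bg', 'bo')})
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):
    h, c = peephole_lstm_step(x, h, c, p)
```

In this formulation the input and forget gates see the previous cell state while the output gate sees the updated one. Under LSTM-g's uniform view as described in the abstract, each gate is simply a unit whose activation multiplies particular connections, so the same activation and learning rules can apply to every unit.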

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/52b8/3217173/2e4bc8d62794/nihms-312462-f0001.jpg
