Tsungnan Lin, Bill G. Horne, C. Lee Giles
EPSON Palo Alto Laboratory, Palo Alto, USA
Neural Netw. 1998 Jul;11(5):861-868. doi: 10.1016/s0893-6080(98)00018-5.
How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies
Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks performs much better than conventional recurrent neural networks at learning certain simple long-term dependency problems. The intuitive explanation for this behavior is that the output memories of a NARX network manifest as jump-ahead connections in the time-unfolded network. These jump-ahead connections propagate gradient information more efficiently, thus reducing the sensitivity of the network to the problem of long-term dependencies. This work gives empirical support to our hypothesis that similar improvements in learning long-term dependencies can be achieved in other classes of recurrent neural network architectures simply by increasing the order of the embedded memory. In particular, we explore how the order of embedded memory affects the ability of three classes of recurrent neural network architectures to learn simple long-term dependency problems: globally recurrent networks, locally recurrent networks, and NARX (output feedback) networks. Comparing the performance of these architectures with different orders of embedded memory on two simple long-term dependency problems shows that all three classes of network architectures improve significantly at learning long-term dependencies as the order of embedded memory is increased. These results can be important to a user committed to a specific recurrent neural network architecture, because simply increasing the embedded memory order of that architecture will make it more robust to the problem of learning long-term dependencies.
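To make the jump-ahead mechanism concrete, below is a minimal NumPy sketch of a NARX-style network whose output-memory order D can be varied. This is not the authors' code: the network size, initialization, and the names narx_forward, W_in, W_fb, w_out, and D are illustrative assumptions. The model computes y(t) = f(u(t), y(t-1), ..., y(t-D)), so each output tap becomes a jump-ahead connection once the network is unfolded in time.

```python
import numpy as np

rng = np.random.default_rng(0)

def narx_forward(u, params, D):
    """Run a one-hidden-layer NARX-style network over a scalar input sequence.

    The hidden layer sees the current input plus a tapped delay line of the
    last D network *outputs* (the embedded output memory).  In the
    time-unfolded graph, each tap y[t-d] is a jump-ahead connection: a
    gradient can cross d time steps through a single edge, which is the
    mechanism the abstract credits for easier long-term learning.
    """
    W_in, W_fb, w_out = params            # shapes: (H,), (H, D), (H,)
    T = len(u)
    y = np.zeros(T)
    for t in range(T):
        # D most recent outputs, zero-padded before the start of the sequence
        taps = np.array([y[t - d] if t >= d else 0.0 for d in range(1, D + 1)])
        h = np.tanh(W_in * u[t] + W_fb @ taps)   # hidden activations, shape (H,)
        y[t] = w_out @ h                         # scalar readout
    return y

# Illustrative usage only: with memory order D, a gradient needs roughly
# T / D jump-ahead hops (instead of T one-step hops) to link time steps
# that are T apart, which is why larger D eases long-term credit assignment.
H, D = 8, 5
params = (rng.normal(size=H), 0.1 * rng.normal(size=(H, D)), rng.normal(size=H))
print(narx_forward(rng.normal(size=30), params, D))
```

The same tapped-delay-line idea carries over to the other two architecture classes the paper studies: for globally or locally recurrent networks, the delay line holds past hidden states rather than past outputs, but increasing its order shortens gradient paths in the same way.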