Department of Electrical Engineering, Neurosciences Program, Stanford University, Stanford, CA 94305-9505, USA.
Neural Comput. 2013 Mar;25(3):626-49. doi: 10.1162/NECO_a_00409. Epub 2012 Dec 28.
Recurrent neural networks (RNNs) are useful tools for learning nonlinear relationships between time-varying inputs and outputs with complex temporal dependencies. Recently developed algorithms have been successful at training RNNs to perform a wide variety of tasks, but the resulting networks have been treated as black boxes: their mechanism of operation remains unknown. Here we explore the hypothesis that fixed points, both stable and unstable, and the linearized dynamics around them, can reveal crucial aspects of how RNNs implement their computations. Further, we explore the utility of linearization in areas of phase space that are not true fixed points but merely points of very slow movement. We present a simple optimization technique that is applied to trained RNNs to find the fixed and slow points of their dynamics. Linearization around these slow regions can be used to explore, or reverse-engineer, the behavior of the RNN. We describe the technique, illustrate it using simple examples, and finally showcase it on three high-dimensional RNN examples: a 3-bit flip-flop device, an input-dependent sine wave generator, and a two-point moving average. In all cases, the mechanisms of trained networks could be inferred from the sets of fixed and slow points and the linearized dynamics around them.
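The abstract's optimization technique can be made concrete with a short sketch. The following is a minimal illustration, not the paper's own code: it assumes a toy continuous-time RNN with dynamics dx/dt = F(x) = -x + tanh(Jx + b), defines the scalar speed q(x) = ½|F(x)|², minimizes q to locate fixed and slow points, and inspects the eigenvalues of the Jacobian at the minimum to characterize the linearized dynamics there. All names (`F`, `q`, `find_slow_point`) and parameter values are illustrative assumptions.

```python
# Sketch of the fixed/slow-point search described in the abstract, on a
# toy continuous-time RNN dx/dt = F(x) = -x + tanh(J x + b).
# Illustrative only; not the authors' implementation.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 50                                        # number of units (assumed)
J = rng.normal(0.0, 1.2 / np.sqrt(N), (N, N))  # random recurrent weights
b = np.zeros(N)                               # bias

def F(x):
    """RNN dynamics: dx/dt = F(x)."""
    return -x + np.tanh(J @ x + b)

def jacobian(x):
    """Jacobian dF/dx = -I + diag(1 - tanh^2(Jx + b)) J."""
    d = 1.0 - np.tanh(J @ x + b) ** 2
    return -np.eye(N) + d[:, None] * J

def q(x):
    """Scalar speed q(x) = 1/2 |F(x)|^2; zero exactly at fixed points."""
    f = F(x)
    return 0.5 * f @ f

def grad_q(x):
    """Analytic gradient of q: J_F(x)^T F(x)."""
    return jacobian(x).T @ F(x)

def find_slow_point(x0, tol=1e-12):
    """Minimize q from x0; small nonzero minima are slow points."""
    res = minimize(q, x0, jac=grad_q, method="L-BFGS-B", tol=tol)
    return res.x, res.fun

# Seed the search near states the network actually visits.
x_star, q_star = find_slow_point(rng.normal(0.0, 0.5, N))
eigs = np.linalg.eigvals(jacobian(x_star))
print(f"q(x*) = {q_star:.2e}, max Re(eig) = {eigs.real.max():+.3f}")
# q(x*) ~ 0 indicates a true fixed point; eigenvalues of the Jacobian with
# positive real part mark unstable directions of the linearized dynamics.
```

In practice the search is run from many initial conditions sampled along trained-network trajectories, and the resulting set of fixed and slow points, together with their local linearizations, is what supports the reverse-engineering described above.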