Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, and Laboratoire de Physique Statistique, CNRS UMR 8550, Ecole Normale Supérieure-PSL Research University, Paris 75005, France
Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Supérieure-PSL Research University, Paris 75005, France
Neural Comput. 2019 Jun;31(6):1139-1182. doi: 10.1162/neco_a_01187. Epub 2019 Apr 12.
Recurrent neural networks have been extensively studied in the context of neuroscience and machine learning due to their ability to implement complex computations. While substantial progress in designing effective learning algorithms has been achieved, a full understanding of trained recurrent networks is still lacking. Specifically, the mechanisms that allow computations to emerge from the underlying recurrent dynamics are largely unknown. Here we focus on a simple yet underexplored computational setup: a feedback architecture trained to associate a stationary output with a stationary input. As a starting point, we derive an approximate analytical description of global dynamics in trained networks, which assumes uncorrelated connectivity weights in the feedback and in the random bulk. The resulting mean-field theory suggests that the task admits several classes of solutions with different stability properties. Different classes are characterized in terms of the geometrical arrangement of the readout with respect to the input vectors, defined in the high-dimensional space spanned by the network population. We find that such an approximate theoretical approach can be used to understand how standard training techniques implement the input-output task in finite-size feedback networks. In particular, our simplified description captures the local and the global stability properties of the target solution, and thus predicts training performance.
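The setup described above can be illustrated with a minimal numerical sketch: a rate network with a random bulk plus a rank-one feedback loop, where the readout is chosen so that the closed loop admits a fixed point with the desired stationary output. This is an illustrative construction under stated assumptions (network size, subcritical gain, tanh nonlinearity, minimum-norm readout), not the paper's actual training procedure; all parameter names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500            # network size (hypothetical)
g = 0.8            # random-bulk gain, kept subcritical so the fixed point is stable
dt = 0.1           # Euler integration step

J = rng.normal(0.0, g / np.sqrt(N), (N, N))   # random bulk connectivity
I_vec = rng.normal(0.0, 1.0, N)               # stationary input vector
u = rng.normal(0.0, 1.0, N)                   # feedback vector
z_target = 0.5                                # desired stationary output

# Step 1: run the network with the output clamped at its target value,
# so the activity relaxes to the fixed point the trained loop should realize.
x = np.zeros(N)
for _ in range(2000):
    x += dt * (-x + J @ np.tanh(x) + u * z_target + I_vec)
phi = np.tanh(x)

# Step 2: pick a readout w satisfying w . phi = z_target
# (the minimum-norm solution, one of many possible choices).
w = z_target * phi / (phi @ phi)

# Step 3: close the feedback loop, z = w . phi(x), and simulate.
x = np.zeros(N)
for _ in range(5000):
    z = w @ np.tanh(x)
    x += dt * (-x + J @ np.tanh(x) + u * z + I_vec)

output_error = abs(w @ np.tanh(x) - z_target)
print(output_error)  # small when the target fixed point is stable
```

Whether the closed loop actually settles at the target fixed point depends on the geometry of the readout relative to the input and feedback vectors, which is precisely the distinction between solution classes the mean-field analysis characterizes.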