Costacurta Julia C, Bhandarkar Shaunak, Zoltowski David M, Linderman Scott W
Wu Tsai Neurosciences Institute, Stanford, CA, USA.
Department of Electrical Engineering, Stanford, CA, USA.
bioRxiv. 2024 Jul 26:2024.07.26.605315. doi: 10.1101/2024.07.26.605315.
The goal of theoretical neuroscience is to develop models that help us better understand biological intelligence. Such models range broadly in complexity and biological detail. For example, task-optimized recurrent neural networks (RNNs) have generated hypotheses about how the brain may perform various computations, but these models typically assume a fixed weight matrix representing the synaptic connectivity between neurons. Decades of neuroscience research, however, have shown that synaptic weights change constantly, controlled in part by chemicals such as neuromodulators. In this work, we explore the computational implications of synaptic gain scaling, a form of neuromodulation, using task-optimized low-rank RNNs. In our neuromodulated RNN (NM-RNN) model, a neuromodulatory subnetwork outputs a low-dimensional neuromodulatory signal that dynamically scales the low-rank recurrent weights of an output-generating RNN. In empirical experiments, we find that the structured flexibility of the NM-RNN allows it to train and generalize more accurately than low-rank RNNs on a set of canonical tasks. Additionally, through theoretical analysis, we show how neuromodulatory gain scaling endows networks with gating mechanisms commonly found in artificial RNNs. We conclude by analyzing the low-rank dynamics of trained NM-RNNs to show how task computations are distributed.
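As a concrete illustration of the architecture the abstract describes, below is a minimal JAX sketch of one NM-RNN step, not the authors' implementation. It assumes the output-generating RNN has rank-R recurrence W = U V^T, that a small vanilla RNN serves as the neuromodulatory subnetwork and emits an R-dimensional gain signal s_t, and that s_t rescales each rank-1 component, i.e. W_t = U diag(s_t) V^T. All names (nm_rnn_step, W_z, and so on), the softplus choice for keeping gains nonnegative, and the leaky-tanh dynamics are illustrative assumptions.

    import jax
    import jax.numpy as jnp

    N, R, M, K = 100, 3, 10, 2  # neurons, rank, neuromod units, input dim

    def nm_rnn_step(carry, x_t, params):
        h, z = carry  # hidden states of the main and neuromodulatory RNNs
        # Neuromodulatory subnetwork: a small vanilla RNN producing gains s_t.
        z = jnp.tanh(params["W_z"] @ z + params["B_z"] @ x_t)
        s_t = jax.nn.softplus(params["C_z"] @ z)  # nonnegative R-dim gain signal
        # Output-generating RNN with dynamically scaled low-rank recurrence:
        # W_t = U diag(s_t) V^T, applied efficiently as U @ (s_t * (V^T h)).
        rec = params["U"] @ (s_t * (params["V"].T @ h))
        h = (1.0 - params["alpha"]) * h \
            + params["alpha"] * jnp.tanh(rec + params["B"] @ x_t)
        y_t = params["C"] @ h  # linear readout
        return (h, z), y_t

    key = jax.random.PRNGKey(0)
    ks = jax.random.split(key, 7)
    params = {
        "U":   jax.random.normal(ks[0], (N, R)) / jnp.sqrt(N),
        "V":   jax.random.normal(ks[1], (N, R)) / jnp.sqrt(N),
        "B":   jax.random.normal(ks[2], (N, K)) / jnp.sqrt(K),
        "C":   jax.random.normal(ks[3], (1, N)) / jnp.sqrt(N),
        "W_z": jax.random.normal(ks[4], (M, M)) / jnp.sqrt(M),
        "B_z": jax.random.normal(ks[5], (M, K)) / jnp.sqrt(K),
        "C_z": jax.random.normal(ks[6], (R, M)) / jnp.sqrt(M),
        "alpha": 0.1,
    }
    h0, z0 = jnp.zeros(N), jnp.zeros(M)
    xs = jnp.zeros((50, K))  # dummy 50-step input sequence
    (_, _), ys = jax.lax.scan(
        lambda c, x: nm_rnn_step(c, x, params), (h0, z0), xs)

Because s_t multiplies each rank-1 component of the recurrence, driving a gain toward zero effectively switches that component off, which is one way to see the connection to the gating mechanisms of artificial RNNs that the abstract mentions.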