University of Washington Center for Computational Neuroscience and Swartz Center for Theoretical Neuroscience, Seattle, WA, USA.
Department of Applied Mathematics, University of Washington, Seattle, WA, USA.
Nat Commun. 2021 Mar 3;12(1):1417. doi: 10.1038/s41467-021-21696-1.
Artificial neural networks have recently achieved many successes in solving sequential processing and planning tasks. Their success is often ascribed to the emergence of the task's low-dimensional latent structure in the network activity, i.e., in the learned neural representations. Here, we investigate the hypothesis that one means of generating representations with easily accessed low-dimensional latent structure, possibly reflecting an underlying semantic organization, is learning to predict observations about the world. Specifically, we ask whether and when the network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables. Using a recurrent neural network model trained to predict a sequence of observations, we show that the network dynamics exhibit low-dimensional but nonlinearly transformed representations of sensory inputs that map the latent structure of the sensory environment. We quantify these results using nonlinear measures of intrinsic dimensionality and linear decodability of the latent variables, and we provide mathematical arguments for why such useful predictive representations emerge. Throughout, we focus on how our results can aid the analysis and interpretation of experimental data.
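To make the setup concrete, here is a minimal sketch (not the authors' code) of the kind of predictive-learning experiment the abstract describes: a recurrent network trained to predict the next observation in a sequence. The toy environment below, a one-dimensional latent variable drifting on a ring and observed through a fixed random nonlinear expansion, is an illustrative assumption; all sizes and names (`obs_dim`, `hidden_dim`, `Predictor`) are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim, obs_dim, hidden_dim = 1, 50, 128  # illustrative sizes, not the paper's

# Fixed random map from the 2-D ring embedding to high-dimensional observations.
W = torch.randn(2, obs_dim) / obs_dim ** 0.5

def make_batch(batch, steps):
    # Latent variable: a slow random walk of an angle theta (the "latent structure").
    theta = torch.cumsum(0.1 * torch.randn(batch, steps, latent_dim), dim=1)
    ring = torch.cat([torch.cos(theta), torch.sin(theta)], dim=-1)
    obs = torch.tanh(ring @ W)  # observations: a nonlinear function of the latent
    return obs, theta

class Predictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, obs_dim)

    def forward(self, obs):
        h, _ = self.rnn(obs)       # hidden states: the learned representation
        return self.readout(h), h  # predicted next observation, hidden states

model = Predictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    obs, _ = make_batch(batch=32, steps=100)
    pred, _ = model(obs[:, :-1])              # predict o_{t+1} from o_{<=t}
    loss = ((pred - obs[:, 1:]) ** 2).mean()  # next-step prediction error
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The only training signal is next-observation prediction; the latent variable theta is never shown to the network, which is what makes any latent structure found in the hidden states an emergent property.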
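Continuing the sketch above, the two quantifications the abstract names could look like the following: linear decodability of the latent variable from the hidden states, here via ridge regression, and a nonlinear intrinsic-dimensionality estimate, here the Two-NN estimator of Facco et al., which is one standard choice; the paper's exact measures may differ.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.neighbors import NearestNeighbors

# Collect hidden states and the true latent variable from the trained model.
with torch.no_grad():
    obs, theta = make_batch(batch=64, steps=100)
    _, h = model(obs)
H = h.reshape(-1, hidden_dim).numpy()
z = theta.reshape(-1).numpy()
target = np.stack([np.cos(z), np.sin(z)], axis=1)  # latent position on the ring

# (i) Linear decodability: held-out R^2 of a ridge decoder for the latent.
n = len(H) // 2
decoder = Ridge(alpha=1.0).fit(H[:n], target[:n])
print("decoding R^2:", r2_score(target[n:], decoder.predict(H[n:])))

# (ii) Two-NN intrinsic dimensionality: from the ratio mu of each point's
# second- to first-nearest-neighbor distance, d ~= 1 / mean(log mu).
dists, _ = NearestNeighbors(n_neighbors=3).fit(H).kneighbors(H)
mu = dists[:, 2] / dists[:, 1]  # column 0 is the point itself
print("Two-NN intrinsic dimension:", 1.0 / np.mean(np.log(mu)))
```

Under this toy setup, high decoding R^2 with an intrinsic dimension near 1 would be the signature the abstract points to: a representation that is nonlinearly transformed yet low-dimensional and linearly readable in the latent variable.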