Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Motional AD, Inc., Boston, MA 02210, USA.
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Neuron. 2023 Apr 19;111(8):1331-1344.e8. doi: 10.1016/j.neuron.2023.01.023. Epub 2023 Mar 9.
Humans learn internal models of the world that support planning and generalization in complex environments. Yet it remains unclear how such internal models are represented and learned in the brain. We approach this question using theory-based reinforcement learning, a strong form of model-based reinforcement learning in which the model is a kind of intuitive theory. We analyzed fMRI data from human participants learning to play Atari-style games. We found evidence of theory representations in prefrontal cortex and of theory updating in prefrontal cortex, occipital cortex, and fusiform gyrus. Theory updates coincided with transient strengthening of theory representations. Effective connectivity during theory updating suggests that information flows from prefrontal theory-coding regions to posterior theory-updating regions. Together, our results are consistent with a neural architecture in which top-down theory representations originating in prefrontal regions shape sensory predictions in visual areas, where factored theory prediction errors are computed and trigger bottom-up updates of the theory.