Fermin Alan S R, Yoshida Takehiko, Yoshimoto Junichiro, Ito Makoto, Tanaka Saori C, Doya Kenji
Graduate School of Information Science, Nara Institute of Science and Technology, Nara 630-0192, Japan.
Neural Computation Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa 904-0495, Japan.
Sci Rep. 2016 Aug 19;6:31378. doi: 10.1038/srep31378.
Humans can select actions by learning, planning, or retrieving motor memories. Reinforcement learning (RL) associates these processes with three major classes of strategies for action selection: exploratory RL learns state-action values by exploration, model-based RL uses internal models to simulate future states reached by hypothetical actions, and motor-memory RL selects past successful state-action mappings. In order to investigate the neural substrates that implement these strategies, we conducted a functional magnetic resonance imaging (fMRI) experiment while humans performed a sequential action selection task under conditions that promoted the use of a specific RL strategy. The ventromedial prefrontal cortex and ventral striatum increased activity in the exploratory condition; the dorsolateral prefrontal cortex, dorsomedial striatum, and lateral cerebellum in the model-based condition; and the supplementary motor area, putamen, and anterior cerebellum in the motor-memory condition. These findings suggest that a distinct prefrontal-basal ganglia and cerebellar network implements the model-based RL action selection strategy.
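The three action-selection strategies contrasted in the abstract can be sketched in code. This is a minimal illustration under assumed toy settings (a two-state task with two actions); the function names, the task, and the data structures are hypothetical and are not the paper's actual experimental task or computational models.

```python
import random

# Toy setting (assumption): two states {0, 1}, two actions {0, 1};
# from state 0, action 1 leads to the rewarded state 1.
ACTIONS = [0, 1]


def exploratory_select(q_values, state, epsilon=0.1):
    """Exploratory RL: epsilon-greedy over learned state-action values."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)  # explore a random action
    return max(ACTIONS, key=lambda a: q_values[(state, a)])  # exploit


def model_based_select(model, reward, state):
    """Model-based RL: use an internal model to simulate the future
    state reached by each hypothetical action, then pick the action
    whose simulated outcome is most rewarding."""
    return max(ACTIONS, key=lambda a: reward[model[(state, a)]])


def motor_memory_select(memory, state):
    """Motor-memory RL: retrieve the past successful state-action
    mapping for this state."""
    return memory[state]


# Illustrative task data (assumptions, not from the paper):
model = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}  # transition model
reward = {0: 0.0, 1: 1.0}                             # state rewards
q_values = {(0, 0): 0.0, (0, 1): 0.9,                 # learned values
            (1, 0): 0.0, (1, 1): 0.0}
memory = {0: 1, 1: 0}                                 # stored mappings

print(model_based_select(model, reward, 0))  # → 1 (simulated lookahead)
print(motor_memory_select(memory, 0))        # → 1 (retrieved mapping)
```

All three functions return the same action here by construction, which mirrors the experimental logic: the strategies can produce identical behavior while relying on different computations, so distinguishing them requires conditions that promote one strategy over the others.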