Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy.
University of Amsterdam, Swammerdam Institute for Life Sciences-Center for Neuroscience Amsterdam, The Netherlands.
PLoS Comput Biol. 2018 Sep 17;14(9):e1006316. doi: 10.1371/journal.pcbi.1006316. eCollection 2018 Sep.
While the neurobiology of simple and habitual choices is relatively well known, our current understanding of goal-directed choices and planning in the brain is still limited. Theoretical work suggests that goal-directed computations can be productively associated with model-based (reinforcement learning) computations, yet a detailed mapping between computational processes and neuronal circuits remains to be fully established. Here we report a computational analysis that aligns Bayesian nonparametrics and model-based reinforcement learning (MB-RL) with the functioning of the hippocampus (HC) and the ventral striatum (vStr), a neuronal circuit that is increasingly recognized as an appropriate model system for understanding goal-directed (spatial) decisions and planning mechanisms in the brain. We test the MB-RL agent in a contextual conditioning task that depends on intact hippocampal and ventral striatal (shell) function, and show that it solves the task while exhibiting key behavioral and neuronal signatures of the HC-vStr circuit. Our simulations also explore the benefits of biological forms of look-ahead prediction (forward sweeps) during both learning and control. This article thus helps to fill the gap between our current understanding of computational algorithms and biological realizations of (model-based) reinforcement learning.
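To make the idea of look-ahead prediction concrete, the following is a minimal sketch of how a model-based agent might score actions by simulating forward sweeps through a learned model. It is an illustration only and not the agent described in the article: the toy maze, the `TRANSITIONS` and `REWARDS` tables, the function names, and the discount factor `gamma` are all assumptions introduced here, and the Bayesian nonparametric learning of the model is omitted.

```python
# Hypothetical toy maze (not the task used in the article):
# each state maps actions to the state they lead to.
TRANSITIONS = {
    "start": {"left": "left_arm", "right": "right_arm"},
    "left_arm": {"forward": "left_goal"},
    "right_arm": {"forward": "right_goal"},
    "left_goal": {},
    "right_goal": {},
}

# Hypothetical reward received on entering a state (default 0).
REWARDS = {"left_goal": 1.0, "right_goal": 0.0}


def sweep_value(state, depth, gamma=0.9):
    """Best discounted return reachable from `state` within `depth` simulated steps."""
    if depth == 0 or not TRANSITIONS[state]:
        return 0.0
    return max(
        REWARDS.get(nxt, 0.0) + gamma * sweep_value(nxt, depth - 1, gamma)
        for nxt in TRANSITIONS[state].values()
    )


def choose_action(state, depth=3, gamma=0.9):
    """Score each available action by a simulated forward sweep and pick the best."""
    scores = {
        action: REWARDS.get(nxt, 0.0) + gamma * sweep_value(nxt, depth - 1, gamma)
        for action, nxt in TRANSITIONS[state].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    action, scores = choose_action("start")
    print(action, scores)
```

In this sketch the sweep is an exhaustive, deterministic search to a fixed depth; a sampled or probabilistic sweep would be an equally reasonable assumption, and the choice here is made only to keep the example short.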