Department of Psychology and Neuroscience, Duke University, Durham, NC, USA.
Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA.
Trends Cogn Sci. 2018 Oct;22(10):911-922. doi: 10.1016/j.tics.2018.08.004.
We present an integrated view of interval timing and reinforcement learning (RL) in the brain. The computational goal of RL is to maximize future rewards, and this depends crucially on a representation of time. Different RL systems in the brain process time in distinct ways. A model-based system learns 'what happens when', employing this internal model to generate action plans, while a model-free system learns to predict reward directly from a set of temporal basis functions. We describe how these systems are subserved by a computational division of labor between several brain regions, with a focus on the basal ganglia and the hippocampus, as well as how these regions are influenced by the neuromodulator dopamine.
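To make the model-free idea concrete, the following is a minimal sketch of temporal-difference learning in which value is a weighted sum of temporal basis functions. It is an illustration under stated assumptions, not the model described in the article: the Gaussian shape of the basis functions, the single-cue trial structure, and all names and parameter values (`temporal_basis`, `learn_values`, `reward_step`, `alpha`, `gamma`, `width`) are hypothetical choices made for the example.

```python
import math


def temporal_basis(t, centers, width):
    """Gaussian bumps over elapsed time since cue onset (an assumed basis shape)."""
    return [math.exp(-((t - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def learn_values(n_steps=20, reward_step=12, n_trials=1000,
                 alpha=0.1, gamma=0.9, width=0.5):
    """TD(0) with linear function approximation over repeated identical trials.

    Each trial starts at cue onset (t = 0) and delivers a reward of 1.0 on
    leaving time step `reward_step`. All settings are illustrative assumptions.
    """
    # One basis function per time step, so the representation is nearly tabular.
    centers = list(range(n_steps))
    features = [temporal_basis(t, centers, width) for t in range(n_steps)]
    weights = [0.0] * len(centers)

    def value(t):
        # Value is zero once the trial has ended.
        if t >= n_steps:
            return 0.0
        return sum(w * x for w, x in zip(weights, features[t]))

    for _ in range(n_trials):
        for t in range(n_steps):
            reward = 1.0 if t == reward_step else 0.0
            # Reward prediction error for the transition t -> t + 1.
            delta = reward + gamma * value(t + 1) - value(t)
            # Move each weight in proportion to its basis function's activity.
            for i, x in enumerate(features[t]):
                weights[i] += alpha * delta * x

    return [value(t) for t in range(n_steps)]


if __name__ == "__main__":
    for t, v in enumerate(learn_values()):
        print(t, round(v, 3))
```

With these settings the learned value rises toward the time of reward and falls to roughly zero afterward, which is the sense in which a reward prediction depends on a representation of elapsed time. Using fewer or wider basis functions would give a coarser estimate of when the reward arrives.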