Integrating Models of Interval Timing and Reinforcement Learning.

Affiliations

Department of Psychology and Neuroscience, Duke University, Durham, NC, USA.

Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA.

Publication information

Trends Cogn Sci. 2018 Oct;22(10):911-922. doi: 10.1016/j.tics.2018.08.004.

PMID:30266150

Abstract

We present an integrated view of interval timing and reinforcement learning (RL) in the brain. The computational goal of RL is to maximize future rewards, and this depends crucially on a representation of time. Different RL systems in the brain process time in distinct ways. A model-based system learns 'what happens when', employing this internal model to generate action plans, while a model-free system learns to predict reward directly from a set of temporal basis functions. We describe how these systems are subserved by a computational division of labor between several brain regions, with a focus on the basal ganglia and the hippocampus, as well as how these regions are influenced by the neuromodulator dopamine.
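
The idea of a model-free system that predicts reward from temporal basis functions can be made concrete with a small sketch. The abstract does not specify an algorithm, so everything below is an illustrative assumption: linear temporal-difference learning, TD(0), is swapped in as a standard choice, the basis functions are taken to be Gaussian bumps tiling the time since cue onset, and all names and parameter values (`gaussian_basis`, `train_td`, `alpha`, `gamma`, `width`) are hypothetical.

```python
import math


def gaussian_basis(t, centers, width):
    """Activation of each (assumed Gaussian) temporal basis function at step t."""
    return [math.exp(-((t - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def train_td(n_steps=20, reward_step=15, n_trials=500,
             alpha=0.05, gamma=0.9, n_basis=10, width=1.5):
    """Linear TD(0) over a fixed cue-to-reward interval, repeated for n_trials.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    # Basis centers spread evenly over the trial duration.
    centers = [i * (n_steps - 1) / (n_basis - 1) for i in range(n_basis)]
    features = [gaussian_basis(t, centers, width) for t in range(n_steps)]
    weights = [0.0] * n_basis

    def value(t):
        # The value after the final step of the trial is defined as zero.
        if t >= n_steps:
            return 0.0
        return sum(w * x for w, x in zip(weights, features[t]))

    for _ in range(n_trials):
        for t in range(n_steps):
            reward = 1.0 if t == reward_step else 0.0
            # Reward prediction error for the transition from step t to t + 1.
            delta = reward + gamma * value(t + 1) - value(t)
            # Each weight moves in proportion to its basis activation.
            for i, x in enumerate(features[t]):
                weights[i] += alpha * delta * x

    return [value(t) for t in range(n_steps)]


values = train_td()
```

In this toy setup the learned value estimate should be higher around the rewarded step than at cue onset or at the end of the trial, which is one simple sense in which reward prediction depends on a representation of time. The prediction error `delta` is the quantity usually compared with dopamine signals in this literature; the sketch makes no claim about how the brain implements it.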
