Department of Psychology, New York University, 4 Washington Pl. Suite 888, New York, NY 10003, USA.
Eur J Neurosci. 2012 Apr;35(7):1011-23. doi: 10.1111/j.1460-9568.2011.07920.x.
Behavior may be generated on the basis of many different kinds of learned contingencies. For instance, responses could be guided by the direct association between a stimulus and response, or by sequential stimulus-stimulus relationships (as in model-based reinforcement learning or goal-directed actions). However, the neural architecture underlying sequential predictive learning is not well understood, in part because it is difficult to isolate its effect on choice behavior. To track such learning more directly, we examined reaction times (RTs) in a probabilistic sequential picture identification task in healthy individuals. We used computational learning models to isolate trial-by-trial effects of two distinct learning processes in behavior, and used these as signatures to analyze the separate neural substrates of each process. RTs were best explained by the combination of two delta rule learning processes with different learning rates. To examine neural manifestations of these learning processes, we used functional magnetic resonance imaging to seek correlates of time series related to expectancy or surprise. We observed such correlates in two regions, hippocampus and striatum. By estimating the learning rates best explaining each signal, we verified that they were uniquely associated with one of the two distinct processes identified behaviorally. These differential correlates suggest that complementary anticipatory functions drive each region's effect on behavior. Our results provide novel insights into the quantitative computational distinctions between medial temporal and basal ganglia learning networks and enable experiments that exploit trial-by-trial measurement of the unique contributions of both hippocampus and striatum to response behavior.
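As an illustration only, the idea of two delta rule learners with different learning rates jointly explaining RTs can be sketched as follows. This is not the authors' model: the function names, the uniform initialization, the linear mapping from expectancy to RT, and all parameter values are assumptions made for the example. Each learner tracks the conditional probability of the current picture given the previous one and moves its estimate toward the observed outcome by a fraction set by its learning rate.

```python
import numpy as np


def delta_rule_expectancy(stimuli, n_stimuli, learning_rate):
    """Return, per trial, the learned probability of the observed stimulus
    given the previous stimulus, before that trial's update is applied."""
    # Assumption: transition estimates start out uniform.
    P = np.full((n_stimuli, n_stimuli), 1.0 / n_stimuli)
    expectancy = np.full(len(stimuli), 1.0 / n_stimuli)
    for t in range(1, len(stimuli)):
        prev, curr = stimuli[t - 1], stimuli[t]
        expectancy[t] = P[prev, curr]
        # Delta rule: move the row for `prev` toward the one-hot outcome.
        target = np.zeros(n_stimuli)
        target[curr] = 1.0
        P[prev] += learning_rate * (target - P[prev])
    return expectancy


def predict_rt(stimuli, n_stimuli, lr_slow, lr_fast, intercept, w_slow, w_fast):
    """Hypothetical linear RT model: higher expectancy from either learner
    shortens the predicted reaction time."""
    e_slow = delta_rule_expectancy(stimuli, n_stimuli, lr_slow)
    e_fast = delta_rule_expectancy(stimuli, n_stimuli, lr_fast)
    return intercept - w_slow * e_slow - w_fast * e_fast


if __name__ == "__main__":
    # Toy sequence of picture indices; values are made up for illustration.
    sequence = [0, 1, 0, 1, 0, 2, 0, 1, 0, 1]
    rts = predict_rt(sequence, n_stimuli=3, lr_slow=0.1, lr_fast=0.5,
                     intercept=600.0, w_slow=100.0, w_fast=100.0)
    print(np.round(rts, 1))
```

In a sketch like this, surprise on a trial could be taken as one minus the expectancy, and the two learning rates would be free parameters fitted to each participant's RTs rather than fixed as above.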