Department of Experimental Psychology, Otto-von-Guericke-Universität Magdeburg, Postfach 4120, D-39016 Magdeburg, Germany.
Neuroimage. 2012 Feb 15;59(4):3457-67. doi: 10.1016/j.neuroimage.2011.11.058. Epub 2011 Nov 30.
Research on the neural bases of learning has mainly focused on reinforcement learning, where the central role of the dopaminergic system is well established. However, in everyday life many decisions are not followed by feedback, in which case humans have been shown to encode the most probable outcome into memory. We used functional magnetic resonance imaging (fMRI) to examine the neural basis of internally generated signals about correctness and decision confidence in the complete absence of feedback in a categorization task. During test trials after observational training, activation in dopaminergic target regions was modulated by the correctness of the answer, much as it was during feedback-based training. Moreover, activation in the nucleus accumbens and putamen was correlated with the prediction error on confidence as estimated by a reinforcement learning model. In this model, subjective confidence ratings acquired after each trial served as the outcome measure. Activation in the striatum therefore follows a similar pattern in response to prediction errors on confidence as it does during reinforcement learning in response to reward prediction errors, but with respect to internally generated signals based on knowledge of the structure of the environment. Furthermore, ventral striatal activation decreased with stimulus novelty, which might support the allocation of attention to unfamiliar stimuli. These results provide a parsimonious account of the neural bases of learning, indicating overlapping neural substrates for reinforcement learning and for learning when outcome information has to be internally constructed.
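The abstract does not give the equations of the reinforcement learning model, so the following is only an illustrative sketch of how a prediction error on confidence could be computed. It assumes a simple delta-rule (Rescorla-Wagner-style) update in which the confidence rating on each trial takes the place of the reward. The function name, the learning rate `alpha`, and the initial expectation `v0` are hypothetical and are not taken from the study.

```python
def confidence_prediction_errors(ratings, alpha=0.1, v0=0.5):
    """Delta-rule sketch: confidence ratings play the role of the outcome.

    ratings : per-trial subjective confidence ratings (assumed scaled to 0..1)
    alpha   : learning rate (assumed value, not reported in the abstract)
    v0      : initial expected confidence (assumed value)

    Returns two lists: the prediction error on each trial and the
    expectation held before that trial's rating was observed.
    """
    v = v0
    errors = []
    expectations = []
    for c in ratings:
        expectations.append(v)   # expectation before this trial's outcome
        delta = c - v            # prediction error on confidence
        errors.append(delta)
        v += alpha * delta       # move the expectation toward the rating
    return errors, expectations
```

In an analysis of this kind, the per-trial `errors` would typically serve as a parametric regressor for the imaging data. Whether the authors' model has this exact form cannot be determined from the abstract alone.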