Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA.
J Neurophysiol. 2009 Dec;102(6):3384-91. doi: 10.1152/jn.91195.2008. Epub 2009 Sep 30.
Prediction error signals have been reported in human imaging studies in target areas of dopamine neurons such as ventral and dorsal striatum during learning with many different types of reinforcers. However, a key question that has yet to be addressed is whether prediction error signals recruit distinct or overlapping regions of striatum and elsewhere during learning with different types of reward. To address this question, we scanned 17 healthy subjects with functional magnetic resonance imaging while they chose actions to obtain either a pleasant juice reward (1 ml apple juice) or a monetary gain (5 cents), and applied a computational reinforcement learning model to subjects' behavioral and imaging data. Evidence for an overlapping prediction error signal during learning with juice and money rewards was found in a region of dorsal striatum (caudate nucleus), while prediction error signals in a subregion of ventral striatum were significantly stronger during learning with money but not juice reward. These results provide evidence for partially overlapping reward prediction signals for different types of appetitive reinforcers within the striatum, a finding with important implications for understanding the nature of associative encoding in the striatum as a function of reinforcer type.
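For readers unfamiliar with the term, a prediction error is the difference between the reward received on a trial and the reward that was expected. The abstract does not state which reinforcement learning model was fitted, so the following is only a minimal sketch assuming a simple delta-rule (Rescorla-Wagner style) update; the function name `prediction_errors`, the learning rate `alpha`, and the initial value `v0` are illustrative assumptions, not details from the study.

```python
def prediction_errors(rewards, alpha=0.2, v0=0.0):
    """Return trial-by-trial prediction errors for one action's reward sequence.

    Assumed delta-rule model (not necessarily the one used in the study):
        delta_t = r_t - V_t
        V_{t+1} = V_t + alpha * delta_t
    """
    value = v0                     # current reward expectation V_t
    deltas = []
    for r in rewards:
        delta = r - value          # prediction error: outcome minus expectation
        deltas.append(delta)
        value += alpha * delta     # move the expectation toward the outcome
    return deltas


# Hypothetical outcome sequence: 1 = reward delivered, 0 = reward omitted.
deltas = prediction_errors([1, 1, 0, 1], alpha=0.5)
print(deltas)  # [1.0, 0.5, -0.75, 0.625]
```

In model-based imaging analyses of this general kind, a per-trial series such as `deltas` is typically used as a regressor against the imaging signal, with a separate series for each reinforcer type (here, juice and money).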