Department of Epilepsy, University of Bonn, Sigmund-Freud-Strasse 25, Bonn, Germany.
Soc Cogn Affect Neurosci. 2007 Mar;2(1):20-30. doi: 10.1093/scan/nsl021.
Reward expectation and reward prediction errors are thought to be critical for dynamic adjustments in decision-making and reward-seeking behavior, but little is known about how they are represented in the brain during uncertainty and risk-taking, or about what role individual differences might play in such reinforcement processes. This study shows that behavioral and neural responses during a decision-making task can be characterized by a computational reinforcement learning model, and that individual differences in the model's learning parameters are critical for elucidating these processes. In the fMRI experiment, subjects chose between high- and low-risk rewards. A computational reinforcement learning model computed the expected values and prediction errors that each subject might experience on each trial. These outputs predicted subjects' trial-to-trial choice strategies and neural activity in several limbic and prefrontal regions during the task. Individual differences in estimated reinforcement learning parameters proved critical for characterizing these processes: models that incorporated individual learning parameters explained significantly more variance in the fMRI data than did a model using fixed learning parameters. These findings suggest that the brain engages a reinforcement learning process during risk-taking and that individual differences play a crucial role in modeling this process.
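The trial-by-trial quantities the abstract refers to (expected value and prediction error) can be illustrated with a minimal sketch. The abstract does not specify the form of the model, so everything here is an assumption made for illustration: a simple delta-rule (Rescorla-Wagner style) update with one learning rate `alpha` per subject, a softmax choice rule with inverse temperature `beta`, and the hypothetical function names `simulate_values` and `softmax_choice_probs`. It is not the authors' implementation.

```python
import math


def simulate_values(choices, rewards, alpha=0.3, n_options=2):
    """Delta-rule sketch (assumed model form, not taken from the paper).

    choices: option index chosen on each trial (e.g. 0 = low-risk, 1 = high-risk)
    rewards: reward received on each trial
    alpha:   learning rate; fitting it per subject is one way to capture
             individual differences in learning
    Returns per-trial expected values, per-trial prediction errors,
    and the final value estimate for each option.
    """
    q = [0.0] * n_options            # value estimates start at zero (assumption)
    expected, errors = [], []
    for choice, reward in zip(choices, rewards):
        expected.append(q[choice])   # expected value of the chosen option
        delta = reward - q[choice]   # prediction error: outcome minus expectation
        errors.append(delta)
        q[choice] += alpha * delta   # move the estimate toward the outcome
    return expected, errors, q


def softmax_choice_probs(q, beta=1.0):
    """Map value estimates to choice probabilities (assumed choice rule)."""
    m = max(q)                       # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in q]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative data only; these are not values from the study.
choices = [0, 1, 1, 0]
rewards = [1.0, 0.0, 2.5, 1.0]
expected, errors, q = simulate_values(choices, rewards, alpha=0.5)
probs = softmax_choice_probs(q, beta=1.0)
```

In a sketch like this, the per-trial `expected` and `errors` series are the kind of model outputs that could serve as regressors for choice behavior or fMRI signal, and comparing a per-subject `alpha` against a single fixed `alpha` mirrors the individual-versus-fixed parameter comparison the abstract describes.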