Preuschoff Kerstin, Quartz Steven R, Bossaerts Peter
Computation and Neural Systems Program and Division of Humanities and Social Sciences, Caltech, Pasadena, California 91125, USA.
J Neurosci. 2008 Mar 12;28(11):2745-52. doi: 10.1523/JNEUROSCI.4286-07.2008.
Understanding how organisms deal with probabilistic stimulus-reward associations has been advanced by a convergence between reinforcement learning models and primate physiology, which demonstrated that the brain encodes a reward prediction error signal. However, organisms must also predict the level of risk associated with reward forecasts, monitor the errors in those risk predictions, and update them in light of new information. Risk prediction serves a dual purpose: (1) to guide choice in risk-sensitive organisms and (2) to modulate learning of uncertain rewards. To date, it is not known whether or how the brain accomplishes risk prediction. Using functional imaging during a simple gambling task in which we constantly changed risk, we show that an early-onset activation in the human insula correlates significantly with risk prediction error and that its time course is consistent with a role in rapid updating. Additionally, we show that activation previously associated with general uncertainty emerges with a delay consistent with a role in risk prediction. The activations correlating with risk prediction and risk prediction errors are the risk counterparts of the activations that correlate with reward prediction and reward prediction errors in reward expectation. As such, our findings indicate that our understanding of the neural basis of reward anticipation under uncertainty needs to be expanded to include risk prediction.
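The parallel between reward prediction errors and risk prediction errors can be made concrete with a small numerical sketch. The abstract does not state the definitions used, so the following rests on assumptions: risk is taken to be the variance of the reward, and the risk prediction error is taken to be the squared reward prediction error minus the predicted variance. The function name `prediction_errors` and its parameters are hypothetical and chosen only for illustration.

```python
def prediction_errors(p_win, reward_win, reward_lose, outcome):
    """Return (reward prediction error, risk prediction error) for a two-outcome gamble.

    Assumptions (not taken from the abstract):
      - predicted reward is the expected value of the gamble
      - predicted risk is the variance of the reward
      - risk prediction error = squared reward prediction error - predicted risk
    """
    # Predicted reward: expected value over the two outcomes.
    expected_reward = p_win * reward_win + (1 - p_win) * reward_lose

    # Predicted risk: variance of the reward around its expected value.
    expected_risk = (
        p_win * (reward_win - expected_reward) ** 2
        + (1 - p_win) * (reward_lose - expected_reward) ** 2
    )

    # Reward prediction error: realized outcome minus predicted reward.
    reward_pe = outcome - expected_reward

    # Risk prediction error: realized squared deviation minus predicted risk.
    risk_pe = reward_pe ** 2 - expected_risk

    return reward_pe, risk_pe


if __name__ == "__main__":
    # A fair gamble paying +1 or -1, resolved as a win.
    print(prediction_errors(0.5, 1.0, -1.0, 1.0))
    # A likely win that unexpectedly resolves as a loss.
    print(prediction_errors(0.9, 1.0, -1.0, -1.0))
```

Under these assumptions, a fair gamble always produces a risk prediction error of zero, because both outcomes deviate from the expected value by the same amount, whereas an unlikely outcome of a skewed gamble produces a large positive risk prediction error.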