Terry Lohrenz, Kevin McCabe, Colin F. Camerer, P. Read Montague
Department of Neuroscience, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
Proc Natl Acad Sci U S A. 2007 May 29;104(22):9493-8. doi: 10.1073/pnas.0608842104. Epub 2007 May 22.
Reinforcement learning models now provide principled guides for a wide range of reward learning experiments in animals and humans. One key learning (error) signal in these models is experiential and reports ongoing temporal differences between expected and experienced reward. However, these same abstract learning models also accommodate the existence of another class of learning signal that takes the form of a fictive error encoding ongoing differences between experienced returns and returns that "could-have-been-experienced" if decisions had been different. These observations suggest the hypothesis that, for all real-world learning tasks, one should expect the presence of both experiential and fictive learning signals. Motivated by this possibility, we used a sequential investment game and fMRI to probe ongoing brain responses to both experiential and fictive learning signals generated throughout the game. Using a large cohort of subjects (n = 54), we report that fictive learning signals strongly predict changes in subjects' investment behavior and correlate with fMRI signals measured in dopaminoceptive structures known to be involved in valuation and choice.
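The two signals described above can be illustrated with a minimal sketch. It is not the model fitted in the study: the function names, the single-step form of the experiential error, and the assumption that a subject bets a fraction of their holdings on a market with a fractional return are all hypothetical simplifications made here for illustration.

```python
def experiential_error(expected_reward: float, experienced_reward: float) -> float:
    """Experienced minus expected reward (a one-step simplification of a
    temporal-difference error; hypothetical helper, not from the paper)."""
    return experienced_reward - expected_reward


def fictive_error(bet_fraction: float, market_return: float) -> float:
    """Return that could have been experienced minus the return actually
    experienced (assumed form, for illustration only).

    bet_fraction: share of holdings invested, between 0 and 1 (assumption).
    market_return: fractional market change over the period (assumption).
    """
    # In hindsight, the best bet is everything if the market rose, nothing if it fell.
    best_bet = 1.0 if market_return > 0 else 0.0
    best_gain = best_bet * market_return
    actual_gain = bet_fraction * market_return
    return best_gain - actual_gain
```

Under these assumptions, a subject who invests half of their holdings before a 25% rise has a fictive error of 0.125 (the gain forgone), and the same bet before a 25% fall also gives 0.125 (the loss that betting nothing would have avoided). The fictive error is zero only when the bet was the best one available in hindsight.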