Individual differences and the neural representations of reward expectation and reward prediction error.

Affiliation

Department of Epileptology, University of Bonn, Sigmund-Freud-Strasse 25, Bonn, Germany.

Publication information

Soc Cogn Affect Neurosci. 2007 Mar;2(1):20-30. doi: 10.1093/scan/nsl021.

DOI: 10.1093/scan/nsl021

PMID: 17710118

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1945222/

Abstract

Reward expectation and reward prediction errors are thought to be critical for dynamic adjustments in decision-making and reward-seeking behavior, but little is known about their representation in the brain during uncertainty and risk-taking. Furthermore, little is known about what role individual differences might play in such reinforcement processes. This study shows that behavioral and neural responses during a decision-making task can be characterized by a computational reinforcement learning model, and that individual differences in the model's learning parameters are critical for elucidating these processes. In the fMRI experiment, subjects chose between high- and low-risk rewards. A computational reinforcement learning model computed the expected values and prediction errors that each subject might experience on each trial. These outputs predicted subjects' trial-to-trial choice strategies and neural activity in several limbic and prefrontal regions during the task. Individual differences in estimated reinforcement learning parameters proved critical for characterizing these processes: models that incorporated individual learning parameters explained significantly more variance in the fMRI data than did a model using fixed learning parameters. These findings suggest that the brain engages a reinforcement learning process during risk-taking and that individual differences play a crucial role in modeling this process.
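The abstract does not state the exact form of the model, so the following is only a minimal sketch of the general idea, assuming a Rescorla-Wagner style delta rule with a softmax choice rule. The function names, the 0/1 reward coding, the parameter grids, and the grid-search fitting are all illustrative assumptions, not the authors' implementation. It shows how per-trial expected values and prediction errors could be generated, and how a learning rate could be estimated separately for each subject rather than fixed across subjects.

```python
import math


def simulate_values(choices, rewards, alpha, n_options=2, v0=0.0):
    """Run a delta-rule learner over one subject's trials (assumed model).

    Returns, per trial, the expected value of the chosen option before the
    outcome and the reward prediction error after it.
    """
    values = [v0] * n_options
    expected, errors = [], []
    for choice, reward in zip(choices, rewards):
        ev = values[choice]                   # expectation before the outcome
        delta = reward - ev                   # reward prediction error
        values[choice] = ev + alpha * delta   # learning-rate-weighted update
        expected.append(ev)
        errors.append(delta)
    return expected, errors


def neg_log_likelihood(choices, rewards, alpha, beta, n_options=2):
    """Negative log-likelihood of observed choices under a softmax policy."""
    values = [0.0] * n_options
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        top = max(values)                     # subtract max for stability
        exps = [math.exp(beta * (v - top)) for v in values]
        nll -= math.log(exps[choice] / sum(exps))
        values[choice] += alpha * (reward - values[choice])
    return nll


def fit_subject(choices, rewards, alphas=None, betas=None):
    """Grid-search one subject's learning rate (alpha) and softmax slope (beta).

    The grids are arbitrary illustrative defaults.
    """
    alphas = alphas or [i / 20 for i in range(1, 20)]
    betas = betas or [0.5, 1.0, 2.0, 4.0, 8.0]
    nll, alpha, beta = min(
        (neg_log_likelihood(choices, rewards, a, b), a, b)
        for a in alphas
        for b in betas
    )
    return alpha, beta, nll


if __name__ == "__main__":
    # Hypothetical data: option 0 = low risk, option 1 = high risk.
    choices = [0, 0, 1, 1, 0]
    rewards = [1, 0, 0, 1, 1]
    alpha, beta, nll = fit_subject(choices, rewards)
    ev, pe = simulate_values(choices, rewards, alpha)
    print(alpha, beta, nll)
    print(ev, pe)
```

In a model-based fMRI analysis of this kind, the per-trial `ev` and `pe` series would typically serve as parametric regressors. Running `simulate_values` with each subject's fitted `alpha`, versus one shared `alpha` for everyone, corresponds to the individual-parameter and fixed-parameter models that the abstract compares.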

Similar articles

A computational neuroimaging study of reinforcement learning and goal-directed exploration in schizophrenia spectrum disorders.

Psychol Med. 2023 Oct;53(14):6600-6610. doi: 10.1017/S0033291722003993. Epub 2023 Feb 8.

Prefrontal solution to the bias-variance tradeoff during reinforcement learning.

Cell Rep. 2021 Dec 28;37(13):110185. doi: 10.1016/j.celrep.2021.110185.

How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.

J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.

Neural substrates of updating the prediction through prediction error during decision making.

Neuroimage. 2017 Aug 15;157:1-12. doi: 10.1016/j.neuroimage.2017.05.041. Epub 2017 May 20.

Policy adjustment in a dynamic economic game.

PLoS One. 2006 Dec 20;1(1):e103. doi: 10.1371/journal.pone.0000103.

Prediction errors drive dynamic changes in neural patterns that guide behavior.

Cell Rep. 2023 Aug 29;42(8):112931. doi: 10.1016/j.celrep.2023.112931. Epub 2023 Aug 3.

Choice, uncertainty and value in prefrontal and cingulate cortex.

Nat Neurosci. 2008 Apr;11(4):389-97. doi: 10.1038/nn2066. Epub 2008 Mar 26.

Dorsal-Ventral Reinforcement Learning Network Connectivity and Incentive-Driven Changes in Exploration.

J Neurosci. 2025 Apr 9;45(15):e0422242025. doi: 10.1523/JNEUROSCI.0422-24.2025.

Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments.

Neuron. 2017 Jan 18;93(2):451-463. doi: 10.1016/j.neuron.2016.12.040.

Cited by

Value-Directed Remembering: A Dual-Process Perspective.

Behav Sci (Basel). 2025 Aug 17;15(8):1113. doi: 10.3390/bs15081113.

Roles and interplay of reinforcement-based and error-based processes during reaching and gait in neurotypical adults and individuals with Parkinson's disease.

PLoS Comput Biol. 2024 Oct 14;20(10):e1012474. doi: 10.1371/journal.pcbi.1012474. eCollection 2024 Oct.

A novel technique for delineating the effect of variation in the learning rate on the neural correlates of reward prediction errors in model-based fMRI.

Front Psychol. 2023 Dec 21;14:1211528. doi: 10.3389/fpsyg.2023.1211528. eCollection 2023.

A neural and behavioral trade-off between value and uncertainty underlies exploratory decisions in normative anxiety.

Mol Psychiatry. 2022 Mar;27(3):1573-1587. doi: 10.1038/s41380-021-01363-z. Epub 2021 Nov 1.

Reward and fictive prediction error signals in ventral striatum: asymmetry between factual and counterfactual processing.

Brain Struct Funct. 2021 Jun;226(5):1553-1569. doi: 10.1007/s00429-021-02270-3. Epub 2021 Apr 11.

The Prisoner's Dilemma paradigm provides a neurobiological framework for the social decision cascade.

PLoS One. 2021 Mar 18;16(3):e0248006. doi: 10.1371/journal.pone.0248006. eCollection 2021.

Enhanced neural responses in specific phases of reward processing in individuals with Internet gaming disorder.

J Behav Addict. 2021 Feb 10;10(1):99-111. doi: 10.1556/2006.2021.00003.

Social anxiety and dynamic social reinforcement learning in a volatile environment.

Clin Psychol Sci. 2019 Nov 1;7(6):1372-1388. doi: 10.1177/2167702619858425. Epub 2019 Sep 20.

Acute Alcohol Intake Produces Widespread Decreases in Cortical Resting Signal Variability in Healthy Social Drinkers.

Alcohol Clin Exp Res. 2020 Jul;44(7):1410-1419. doi: 10.1111/acer.14381. Epub 2020 Jun 18.

Time-frequency approaches to investigating changes in feedback processing during childhood and adolescence.

Psychophysiology. 2018 Oct;55(10):e13208. doi: 10.1111/psyp.13208. Epub 2018 Aug 15.

References

Reinforcement learning signals predict future decisions.

J Neurosci. 2007 Jan 10;27(2):371-8. doi: 10.1523/JNEUROSCI.4421-06.2007.

Separate brain regions code for salience vs. valence during reward prediction in humans.

Hum Brain Mapp. 2007 Apr;28(4):294-302. doi: 10.1002/hbm.20274.

Prediction error as a linear function of reward probability is coded in human nucleus accumbens.

Neuroimage. 2006 Jun;31(2):790-5. doi: 10.1016/j.neuroimage.2006.01.001. Epub 2006 Feb 17.

Predictive neural coding of reward preference involves dissociable responses in human ventral midbrain and ventral striatum.

Neuron. 2006 Jan 5;49(1):157-66. doi: 10.1016/j.neuron.2005.11.014.

Anterior cingulate activity modulates nonlinear decision weight function of uncertain prospects.

Neuroimage. 2006 Apr 1;30(2):668-77. doi: 10.1016/j.neuroimage.2005.09.061.

Representation of action-specific reward values in the striatum.

Science. 2005 Nov 25;310(5752):1337-40. doi: 10.1126/science.1115270.

Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning.

J Neurophysiol. 2006 Feb;95(2):948-59. doi: 10.1152/jn.00382.2005. Epub 2005 Sep 28.

Behavioral and neural predictors of upcoming decisions.

Cogn Affect Behav Neurosci. 2005 Jun;5(2):117-26. doi: 10.3758/cabn.5.2.117.

Learning and decision making in monkeys during a rock-paper-scissors game.

Brain Res Cogn Brain Res. 2005 Oct;25(2):416-30. doi: 10.1016/j.cogbrainres.2005.07.003. Epub 2005 Aug 10.

Ventral-striatal/nucleus-accumbens sensitivity to prediction errors during classification learning.

Hum Brain Mapp. 2006 Apr;27(4):306-13. doi: 10.1002/hbm.20186.
