
Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans.

Authors

Pessiglione Mathias, Seymour Ben, Flandin Guillaume, Dolan Raymond J, Frith Chris D

Affiliation

Wellcome Department of Imaging Neuroscience, 12 Queen Square, London WC1N 3BG, UK.

Publication information

Nature. 2006 Aug 31;442(7106):1042-5. doi: 10.1038/nature05051. Epub 2006 Aug 23.

DOI:10.1038/nature05051
PMID:16929307
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2636869/
Abstract

Theories of instrumental learning are centred on understanding how success and failure are used to improve future decisions. These theories highlight a central role for reward prediction errors in updating the values associated with available actions. In animals, substantial evidence indicates that the neurotransmitter dopamine might have a key function in this type of learning, through its ability to modulate cortico-striatal synaptic efficacy. However, no direct evidence links dopamine, striatal activity and behavioural choice in humans. Here we show that, during instrumental learning, the magnitude of reward prediction error expressed in the striatum is modulated by the administration of drugs enhancing (3,4-dihydroxy-L-phenylalanine; L-DOPA) or reducing (haloperidol) dopaminergic function. Accordingly, subjects treated with L-DOPA have a greater propensity to choose the most rewarding action relative to subjects treated with haloperidol. Furthermore, incorporating the magnitude of the prediction errors into a standard action-value learning algorithm accurately reproduced subjects' behavioural choices under the different drug conditions. We conclude that dopamine-dependent modulation of striatal activity can account for how the human brain uses reward prediction errors to improve future decisions.
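
The "standard action-value learning algorithm" mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no model details, so everything here is an assumption made for illustration: the two-option task, the softmax choice rule, the learning rate `alpha`, the choice temperature `beta`, the reward probabilities, and the hypothetical `reward_magnitude` parameter, which stands in for a drug-dependent scaling of the prediction error. It is not the authors' fitted model.

```python
import math
import random


def update_value(q, outcome, alpha):
    """Move an action value toward the outcome by a fraction of the prediction error."""
    delta = outcome - q  # reward prediction error
    return q + alpha * delta


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing action A over action B given their values."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def simulate(reward_magnitude, n_trials=30, alpha=0.3, beta=3.0,
             p_reward=(0.8, 0.2), seed=0):
    """Simulate one learner choosing between two probabilistically rewarded actions.

    All parameter values are illustrative assumptions. reward_magnitude is a
    hypothetical stand-in for the effective size of the reward signal, which
    in turn scales the prediction errors that drive learning.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]  # action values, both start at zero
    choices = []
    for _ in range(n_trials):
        p_a = softmax_choice_prob(q[0], q[1], beta)
        action = 0 if rng.random() < p_a else 1
        rewarded = rng.random() < p_reward[action]
        outcome = reward_magnitude if rewarded else 0.0
        q[action] = update_value(q[action], outcome, alpha)
        choices.append(action)
    return q, choices


if __name__ == "__main__":
    # Compare a larger and a smaller effective reward magnitude,
    # averaging the rate of choosing the better option over many seeds.
    for label, magnitude in (("larger magnitude", 1.0), ("smaller magnitude", 0.3)):
        rates = []
        for seed in range(200):
            _, choices = simulate(magnitude, seed=seed)
            rates.append(choices.count(0) / len(choices))
        print(label, round(sum(rates) / len(rates), 3))
```

In this sketch, a larger effective reward magnitude produces larger prediction errors, so the two action values separate further and the softmax rule favours the better-rewarded action more strongly. That is one simple way such a model could produce a group difference in choice behaviour like the one the abstract reports.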

