Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans.
Pessiglione Mathias, Seymour Ben, Flandin Guillaume, Dolan Raymond J, Frith Chris D
Wellcome Department of Imaging Neuroscience, 12 Queen Square, London WC1N 3BG, UK.
Nature. 2006 Aug 31;442(7106):1042-5. doi: 10.1038/nature05051. Epub 2006 Aug 23.
Theories of instrumental learning are centred on understanding how success and failure are used to improve future decisions. These theories highlight a central role for reward prediction errors in updating the values associated with available actions. In animals, substantial evidence indicates that the neurotransmitter dopamine might have a key function in this type of learning, through its ability to modulate cortico-striatal synaptic efficacy. However, no direct evidence links dopamine, striatal activity and behavioural choice in humans. Here we show that, during instrumental learning, the magnitude of reward prediction error expressed in the striatum is modulated by the administration of drugs enhancing (3,4-dihydroxy-L-phenylalanine; L-DOPA) or reducing (haloperidol) dopaminergic function. Accordingly, subjects treated with L-DOPA have a greater propensity to choose the most rewarding action relative to subjects treated with haloperidol. Furthermore, incorporating the magnitude of the prediction errors into a standard action-value learning algorithm accurately reproduced subjects' behavioural choices under the different drug conditions. We conclude that dopamine-dependent modulation of striatal activity can account for how the human brain uses reward prediction errors to improve future decisions.
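To make the modelling described above concrete, here is a minimal sketch of the kind of standard action-value learning algorithm the abstract refers to: a delta-rule update driven by the reward prediction error, with softmax action selection. This is not the authors' fitted model; the function name, parameter values, and the `reward_scale` factor used to caricature enhanced versus reduced dopaminergic reinforcement are illustrative assumptions.

```python
import numpy as np

def simulate_instrumental_learning(
    reward_probs=(0.8, 0.2),   # assumed reward probabilities for the two actions
    n_trials=60,
    alpha=0.3,                 # learning rate (illustrative value)
    beta=3.0,                  # softmax inverse temperature (illustrative value)
    reward_scale=1.0,          # >1 caricatures enhanced dopaminergic reinforcement (e.g. L-DOPA),
                               # <1 reduced reinforcement (e.g. haloperidol); an assumption, not the fitted model
    rng=None,
):
    """Delta-rule action-value learning with softmax choice.

    Q-values are updated by the reward prediction error
        delta = r - Q[a],   Q[a] <- Q[a] + alpha * delta,
    and actions are chosen with probability proportional to exp(beta * Q).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_actions = len(reward_probs)
    q = np.zeros(n_actions)
    choices, prediction_errors = [], []

    for _ in range(n_trials):
        # Softmax action selection over current action values.
        logits = beta * q
        p = np.exp(logits - np.max(logits))
        p /= p.sum()
        a = rng.choice(n_actions, p=p)

        # Probabilistic reward; its effective magnitude is scaled to mimic a
        # stronger or weaker dopaminergic teaching signal (assumption).
        r = reward_scale * float(rng.random() < reward_probs[a])

        delta = r - q[a]           # reward prediction error
        q[a] += alpha * delta      # action-value update
        choices.append(a)
        prediction_errors.append(delta)

    return np.array(choices), np.array(prediction_errors)


if __name__ == "__main__":
    # A larger reward_scale makes the simulated agent converge faster on the
    # richer action, qualitatively echoing the L-DOPA vs haloperidol contrast.
    for label, scale in [("enhanced", 1.5), ("reduced", 0.5)]:
        choices, _ = simulate_instrumental_learning(
            reward_scale=scale, rng=np.random.default_rng(0)
        )
        print(label, "- proportion of choices of the richer action:", (choices == 0).mean())
```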