Neural Circuitry of Reward Prediction Error

Authors

Watabe-Uchida Mitsuko, Eshel Neir, Uchida Naoshige

Affiliations

Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138

Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305

Publication information

Annu Rev Neurosci. 2017 Jul 25;40:373-394. doi: 10.1146/annurev-neuro-072116-031109. Epub 2017 Apr 24.

DOI: 10.1146/annurev-neuro-072116-031109

PMID: 28441114

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6721851/

Abstract

Dopamine neurons facilitate learning by calculating reward prediction error, or the difference between expected and actual reward. Despite two decades of research, it remains unclear how dopamine neurons make this calculation. Here we review studies that tackle this problem from a diverse set of approaches, from anatomy to electrophysiology to computational modeling and behavior. Several patterns emerge from this synthesis: that dopamine neurons themselves calculate reward prediction error, rather than inherit it passively from upstream regions; that they combine multiple separate and redundant inputs, which are themselves interconnected in a dense recurrent network; and that despite the complexity of inputs, the output from dopamine neurons is remarkably homogeneous and robust. The more we study this simple arithmetic computation, the knottier it appears to be, suggesting a daunting (but stimulating) path ahead for neuroscience more generally.
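The quantity defined in the abstract can be illustrated with a minimal sketch. It assumes the common sign convention (actual minus expected reward) and a simple delta-rule learner; the function names, the learning rate, and the update rule are illustrative assumptions, not a model taken from the review.

```python
def reward_prediction_error(actual_reward: float, expected_reward: float) -> float:
    """Return actual minus expected reward (assumed sign convention).

    Positive when the outcome is better than expected, negative when worse.
    """
    return actual_reward - expected_reward


def update_expectation(expected_reward: float, actual_reward: float,
                       learning_rate: float = 0.1) -> float:
    """Hypothetical delta-rule update: move the expectation toward the
    outcome by a fraction (learning_rate) of the prediction error."""
    delta = reward_prediction_error(actual_reward, expected_reward)
    return expected_reward + learning_rate * delta


# Repeatedly deliver the same reward: the expectation approaches the reward,
# so the prediction error shrinks toward zero as the reward becomes predicted.
expected = 0.0
for _ in range(50):
    expected = update_expectation(expected, actual_reward=1.0, learning_rate=0.2)
final_error = reward_prediction_error(1.0, expected)
```

In this toy setting an unexpected reward yields a large positive error, a fully predicted reward yields an error near zero, and an omitted reward yields a negative error.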


Similar articles

Neural Circuitry of Reward Prediction Error.

Annu Rev Neurosci. 2017 Jul 25;40:373-394. doi: 10.1146/annurev-neuro-072116-031109. Epub 2017 Apr 24.

Dopamine reward prediction error coding.

Dialogues Clin Neurosci. 2016 Mar;18(1):23-32. doi: 10.31887/DCNS.2016.18.1/wschultz.

A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task.

Neuroscience. 1999;91(3):871-90. doi: 10.1016/s0306-4522(98)00697-6.

Arithmetic and local circuitry underlying dopamine prediction errors.

Nature. 2015 Sep 10;525(7568):243-6. doi: 10.1038/nature14855. Epub 2015 Aug 31.

Computing reward-prediction error: an integrated account of cortical timing and basal-ganglia pathways for appetitive and aversive learning.

Eur J Neurosci. 2015 Aug;42(4):2003-21. doi: 10.1111/ejn.12994. Epub 2015 Jul 25.

Dopaminergic modulation of appetitive and aversive predictive learning.

Rev Neurosci. 2009;20(5-6):383-404. doi: 10.1515/revneuro.2009.20.5-6.383.

Models of heterogeneous dopamine signaling in an insect learning and memory center.

PLoS Comput Biol. 2021 Aug 10;17(8):e1009205. doi: 10.1371/journal.pcbi.1009205. eCollection 2021 Aug.

Dopamine Prediction Errors in Reward Learning and Addiction: From Theory to Neural Circuitry.

Neuron. 2015 Oct 21;88(2):247-63. doi: 10.1016/j.neuron.2015.08.037.

Reward-dependent learning in neuronal networks for planning and decision making.

Prog Brain Res. 2000;126:217-29. doi: 10.1016/S0079-6123(00)26016-0.

How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.

J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.

Cited by

Activation of the tail of the ventral tegmental area in response to pup predicting cues in maternal rats.

Brain Struct Funct. 2025 Jul 24;230(7):121. doi: 10.1007/s00429-025-02987-5.

Four individually identified paired dopamine neurons signal taste punishment in larval .

Elife. 2025 Jun 16;12:RP91387. doi: 10.7554/eLife.91387.

Salience signaling and stimulus scaling of ventral tegmental area glutamate neuron subtypes.

J Neurosci. 2025 Jun 5. doi: 10.1523/JNEUROSCI.1073-24.2025.

DENV2 and ZIKV modulate the feeding behavior of by altering the tyrosine-dopamine pathway.

mBio. 2025 Jun 11;16(6):e0396824. doi: 10.1128/mbio.03968-24. Epub 2025 Apr 29.

Dopamine induces fear extinction by activating the reward-responding amygdala neurons.

Proc Natl Acad Sci U S A. 2025 May 6;122(18):e2501331122. doi: 10.1073/pnas.2501331122. Epub 2025 Apr 28.

Examining the role of the photopigment melanopsin in the striatal dopamine response to light.

Front Syst Neurosci. 2025 Apr 2;19:1568878. doi: 10.3389/fnsys.2025.1568878. eCollection 2025.

Changes in neurotensin signalling drive hedonic devaluation in obesity.

Nature. 2025 Mar 26. doi: 10.1038/s41586-025-08748-y.

Predictive reward-prediction errors of climbing fiber inputs integrate modular reinforcement learning with supervised learning.

PLoS Comput Biol. 2025 Mar 17;21(3):e1012899. doi: 10.1371/journal.pcbi.1012899. eCollection 2025 Mar.

Altered Neural Activity in the Mesoaccumbens Pathway Underlies Impaired Social Reward Processing in Shank3-Deficient Rats.

Adv Sci (Weinh). 2025 May;12(17):e2414813. doi: 10.1002/advs.202414813. Epub 2025 Mar 14.

Fluorescence detection of dopamine signaling to the primate striatum in relation to stimulus-reward associations.

Proc Natl Acad Sci U S A. 2025 Mar 18;122(11):e2426861122. doi: 10.1073/pnas.2426861122. Epub 2025 Mar 13.
