Midbrain dopamine neurons encode a quantitative reward prediction error signal.

Authors

Bayer Hannah M, Glimcher Paul W

Affiliation

Center for Neural Science, New York University, New York, NY 10003, USA.

Publication

Neuron. 2005 Jul 7;47(1):129-41. doi: 10.1016/j.neuron.2005.05.020.

DOI:10.1016/j.neuron.2005.05.020

PMID:15996553

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1564381/

Abstract

The midbrain dopamine neurons are hypothesized to provide a physiological correlate of the reward prediction error signal required by current models of reinforcement learning. We examined the activity of single dopamine neurons during a task in which subjects learned by trial and error when to make an eye movement for a juice reward. We found that these neurons encoded the difference between the current reward and a weighted average of previous rewards, a reward prediction error, but only for outcomes that were better than expected. Thus, the firing rate of midbrain dopamine neurons is quantitatively predicted by theoretical descriptions of the reward prediction error signal used in reinforcement learning models for circumstances in which this signal has a positive value. We also found that the dopamine system continued to compute the reward prediction error even when the behavioral policy of the animal was only weakly influenced by this computation.
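
The quantity described in the abstract, the current reward minus a weighted average of previous rewards, reported only when positive, can be illustrated with a minimal sketch. The exponential weighting, the learning rate `alpha`, the starting prediction, and the function name are assumptions chosen for illustration; the abstract says only "a weighted average of previous rewards" and does not specify these details.

```python
def reward_prediction_errors(rewards, alpha=0.5, initial_prediction=0.0):
    """Return (errors, rectified_errors) for a sequence of rewards.

    Assumption: the "weighted average of previous rewards" is modeled as an
    exponentially weighted running average with hypothetical rate `alpha`.
    """
    prediction = initial_prediction
    errors = []
    rectified = []
    for reward in rewards:
        # Prediction error: current reward minus the running average.
        delta = reward - prediction
        errors.append(delta)
        # Keep only better-than-expected outcomes, mirroring the finding
        # that the signal was encoded only for positive errors.
        rectified.append(max(delta, 0.0))
        # Update the running average toward the latest reward.
        prediction += alpha * delta
    return errors, rectified


if __name__ == "__main__":
    errors, rectified = reward_prediction_errors([1.0, 1.0, 0.0], alpha=0.5)
    print(errors)     # signed prediction errors
    print(rectified)  # positive part only
```

With rewards of 1.0, 1.0, and 0.0, the third trial yields a negative error (the reward is worse than the running average), which the rectified series reports as zero.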
