The Dopamine Prediction Error: Contributions to Associative Models of Reward Learning.

Author information

Nasser Helen M, Calu Donna J, Schoenbaum Geoffrey, Sharpe Melissa J

Affiliations

Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.

Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA; Cellular Neurobiology Research Branch, National Institute on Drug Abuse Intramural Research Program, Baltimore, MD, USA; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA.

Publication information

Front Psychol. 2017 Feb 22;8:244. doi: 10.3389/fpsyg.2017.00244. eCollection 2017.

DOI:10.3389/fpsyg.2017.00244

PMID:28275359

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5319959/

Abstract

Phasic activity of midbrain dopamine neurons is currently thought to encapsulate the prediction-error signal described in Sutton and Barto's (1981) model-free reinforcement learning algorithm. This phasic signal is thought to contain information about the quantitative value of reward, which transfers to the reward-predictive cue after learning. This is argued to endow the reward-predictive cue with the value inherent in the reward, motivating behavior toward cues signaling the presence of reward. Yet theoretical and empirical research has implicated prediction-error signaling in learning that extends far beyond a transfer of quantitative value to a reward-predictive cue. Here, we review the research which demonstrates the complexity of how dopaminergic prediction errors facilitate learning. After briefly discussing the literature demonstrating that phasic dopaminergic signals can act in the manner described by Sutton and Barto (1981), we consider how these signals may also influence attentional processing across multiple attentional systems in distinct brain circuits. Then, we discuss how prediction errors encode and promote the development of context-specific associations between cues and rewards. Finally, we consider recent evidence that shows dopaminergic activity contains information about causal relationships between cues and rewards that reflect information garnered from rich associative models of the world that can be adapted in the absence of direct experience. In discussing this research we hope to support the expansion of how dopaminergic prediction errors are thought to contribute to the learning process beyond the traditional concept of transferring quantitative value.
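The value-transfer account summarized in the abstract can be illustrated with a minimal sketch. It uses a generic tabular temporal-difference update as a stand-in and is not claimed to reproduce the exact formulation of Sutton and Barto (1981) or any analysis in the review. The function name `simulate_td`, the trial layout (a cue followed by a fixed delay and a single reward), and all parameter values are assumptions chosen for illustration.

```python
def simulate_td(n_trials=200, n_steps=5, alpha=0.2, gamma=1.0, reward=1.0):
    """Tabular TD(0) over one cue-reward trial, repeated n_trials times.

    Assumed layout: state 0 is cue onset, and the reward is delivered on
    leaving the last state. The pre-cue baseline is pinned at value 0
    because cue onset is assumed to be unpredictable.
    Returns a list of (cue_error, reward_error) pairs, one per trial.
    """
    values = [0.0] * n_steps  # learned value of each time step after cue onset
    history = []
    for _ in range(n_trials):
        # Prediction error at cue onset: baseline (value 0, no reward) -> state 0.
        cue_error = gamma * values[0] - 0.0
        reward_error = 0.0
        for t in range(n_steps):
            is_last = t == n_steps - 1
            r = reward if is_last else 0.0
            next_value = 0.0 if is_last else values[t + 1]
            # Prediction error: what was received plus what is now expected,
            # minus what had been expected.
            delta = r + gamma * next_value - values[t]
            values[t] += alpha * delta
            if is_last:
                reward_error = delta
        history.append((cue_error, reward_error))
    return history


if __name__ == "__main__":
    history = simulate_td()
    print("first trial (cue error, reward error):", history[0])
    print("last trial  (cue error, reward error):", history[-1])
```

Under these assumptions, the error on the first trial occurs entirely at reward delivery. After repeated pairings it shrinks toward zero at the reward and appears at cue onset instead, which is the transfer of quantitative value that the review goes on to argue is only part of what dopaminergic prediction errors contribute.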

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a3e/5319959/068487c43175/fpsyg-08-00244-g001.jpg
