The Dopamine Prediction Error: Contributions to Associative Models of Reward Learning.

Authors

Nasser Helen M, Calu Donna J, Schoenbaum Geoffrey, Sharpe Melissa J

Affiliations

Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.

Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA; Cellular Neurobiology Research Branch, National Institute on Drug Abuse Intramural Research Program, Baltimore, MD, USA; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA.

Publication information

Front Psychol. 2017 Feb 22;8:244. doi: 10.3389/fpsyg.2017.00244. eCollection 2017.

Abstract

Phasic activity of midbrain dopamine neurons is currently thought to encapsulate the prediction-error signal described in Sutton and Barto's (1981) model-free reinforcement learning algorithm. This phasic signal is thought to contain information about the quantitative value of reward, which transfers to the reward-predictive cue after learning. This is argued to endow the reward-predictive cue with the value inherent in the reward, motivating behavior toward cues signaling the presence of reward. Yet theoretical and empirical research has implicated prediction-error signaling in learning that extends far beyond a transfer of quantitative value to a reward-predictive cue. Here, we review the research which demonstrates the complexity of how dopaminergic prediction errors facilitate learning. After briefly discussing the literature demonstrating that phasic dopaminergic signals can act in the manner described by Sutton and Barto (1981), we consider how these signals may also influence attentional processing across multiple attentional systems in distinct brain circuits. Then, we discuss how prediction errors encode and promote the development of context-specific associations between cues and rewards. Finally, we consider recent evidence that shows dopaminergic activity contains information about causal relationships between cues and rewards that reflect information garnered from rich associative models of the world that can be adapted in the absence of direct experience. In discussing this research we hope to support the expansion of how dopaminergic prediction errors are thought to contribute to the learning process beyond the traditional concept of transferring quantitative value.
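The value-transfer idea in the abstract can be illustrated with a minimal temporal-difference-style sketch. This is a textbook simplification written for illustration, not the authors' model or the exact Sutton and Barto (1981) formulation. The function name, the parameters (alpha, gamma, n_steps), and the assumption that the pre-cue baseline has a fixed value of zero are all hypothetical choices made for this example.

```python
def simulate_td(n_trials=500, n_steps=5, alpha=0.1, gamma=1.0, reward=1.0):
    """Track prediction errors at cue onset and at reward delivery across trials.

    Assumptions (illustrative only): each trial is a chain of n_steps time
    steps starting at cue onset, reward arrives on the last step, and the
    pre-cue baseline has a fixed value of 0 because the cue is unpredictable.
    """
    values = [0.0] * n_steps  # learned value of each time step after cue onset
    history = []
    for _ in range(n_trials):
        # Error when the cue appears: discounted cue value minus baseline (0).
        cue_error = gamma * values[0] - 0.0
        reward_error = 0.0
        for t in range(n_steps):
            r = reward if t == n_steps - 1 else 0.0
            next_value = values[t + 1] if t + 1 < n_steps else 0.0
            # Prediction error: what was received plus what is now expected,
            # minus what was expected at this step.
            delta = r + gamma * next_value - values[t]
            values[t] += alpha * delta
            if t == n_steps - 1:
                reward_error = delta
        history.append((cue_error, reward_error))
    return history


if __name__ == "__main__":
    history = simulate_td()
    print("first trial (cue, reward):", history[0])
    print("last trial  (cue, reward):", history[-1])
```

Under these assumptions, the error on the first trial occurs at reward delivery; with training it shrinks there and grows at cue onset, which is the transfer of quantitative value that the abstract describes as the traditional account.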

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a3e/5319959/068487c43175/fpsyg-08-00244-g001.jpg
