TD models of reward predictive responses in dopamine neurons.

Author information

Suri Roland E

Affiliation

Computational Neurobiology Laboratory, The Salk Institute, San Diego, CA 92186, USA.

Publication information

Neural Netw. 2002 Jun-Jul;15(4-6):523-33. doi: 10.1016/s0893-6080(02)00046-1.

DOI: 10.1016/s0893-6080(02)00046-1
PMID: 12371509

Abstract

This article focuses on recent modeling studies of dopamine neuron activity and their influence on behavior. Activity of midbrain dopamine neurons is phasically increased by stimuli that increase the animal's reward expectation and is decreased below baseline levels when the reward fails to occur. These characteristics resemble the reward prediction error signal of the temporal difference (TD) model, which is a model of reinforcement learning. Computational modeling studies show that such a dopamine-like reward prediction error can serve as a powerful teaching signal for learning with delayed reinforcement, in particular for learning of motor sequences. Several lines of evidence suggest that dopamine is also involved in 'cognitive' processes that are not addressed by standard TD models. I propose the hypothesis that dopamine neuron activity is crucial for planning processes, also referred to as 'goal-directed behavior', which select actions by evaluating predictions about their motivational outcomes.
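
The resemblance between dopamine responses and the TD prediction error described above can be illustrated with a minimal sketch. It is not the article's model: it assumes a simple tapped-delay-line representation (one weight per time step since cue onset), and the names `run_trial`, `CUE_T`, `REWARD_T`, `N_STEPS`, along with the values of `gamma` and `alpha`, are illustrative choices rather than anything taken from the source.

```python
def run_trial(w, cue_t, reward_t, n_steps, gamma=1.0, alpha=0.2,
              rewarded=True, learn=True):
    """Run one conditioning trial and return the TD error at every time step.

    w holds one prediction weight per time step since cue onset
    (an assumed tapped-delay-line representation).
    """
    def value(t):
        # The prediction is zero outside the cue-to-reward interval,
        # because no stimulus feature is active there.
        k = t - cue_t
        return w[k] if 0 <= k < len(w) else 0.0

    delta = [0.0] * n_steps
    for t in range(1, n_steps):
        r = 1.0 if (rewarded and t == reward_t) else 0.0
        # TD error: reward plus discounted current prediction
        # minus the prediction made one step earlier.
        delta[t] = r + gamma * value(t) - value(t - 1)
        k = (t - 1) - cue_t
        if learn and 0 <= k < len(w):
            # Move the earlier prediction toward its TD target.
            w[k] += alpha * delta[t]
    return delta


CUE_T, REWARD_T, N_STEPS = 5, 15, 20
weights = [0.0] * (REWARD_T - CUE_T)

# Before learning: probe a rewarded trial without updating the weights.
naive = run_trial(weights, CUE_T, REWARD_T, N_STEPS, learn=False)

# Train on repeated cue-reward pairings.
for _ in range(500):
    run_trial(weights, CUE_T, REWARD_T, N_STEPS)

# After learning: probe a rewarded trial and a trial with the reward omitted.
trained = run_trial(weights, CUE_T, REWARD_T, N_STEPS, learn=False)
omitted = run_trial(weights, CUE_T, REWARD_T, N_STEPS,
                    rewarded=False, learn=False)

print("naive:   cue", round(naive[CUE_T], 3), "reward", round(naive[REWARD_T], 3))
print("trained: cue", round(trained[CUE_T], 3), "reward", round(trained[REWARD_T], 3))
print("omitted: reward time", round(omitted[REWARD_T], 3))
```

Under these assumptions the error signal starts out positive at the time of the reward, moves to the reward-predicting cue once the association is learned, and turns negative at the expected reward time when the reward is withheld, which is the qualitative pattern the abstract attributes to midbrain dopamine neurons.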

Similar articles

1. TD models of reward predictive responses in dopamine neurons.
   Neural Netw. 2002 Jun-Jul;15(4-6):523-33. doi: 10.1016/s0893-6080(02)00046-1.
2. A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task.
   Neuroscience. 1999;91(3):871-90. doi: 10.1016/s0306-4522(98)00697-6.
3. Predictive reward signal of dopamine neurons.
   J Neurophysiol. 1998 Jul;80(1):1-27. doi: 10.1152/jn.1998.80.1.1.
4. Involvement of basal ganglia and orbitofrontal cortex in goal-directed behavior.
   Prog Brain Res. 2000;126:193-215. doi: 10.1016/S0079-6123(00)26015-9.
5. Temporal difference model reproduces anticipatory neural activity.
   Neural Comput. 2001 Apr;13(4):841-62. doi: 10.1162/089976601300014376.
6. Modeling functions of striatal dopamine modulation in learning and planning.
   Neuroscience. 2001;103(1):65-85. doi: 10.1016/s0306-4522(00)00554-6.
7. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task.
   J Neurosci. 1993 Mar;13(3):900-13. doi: 10.1523/JNEUROSCI.13-03-00900.1993.
8. Midbrain dopamine neurons signal phasic and ramping reward prediction error during goal-directed navigation.
   Cell Rep. 2022 Oct 11;41(2):111470. doi: 10.1016/j.celrep.2022.111470.
9. Dopamine neurons report an error in the temporal prediction of reward during learning.
   Nat Neurosci. 1998 Aug;1(4):304-9. doi: 10.1038/1124.
10. A Neural Circuit Mechanism for the Involvements of Dopamine in Effort-Related Choices: Decay of Learned Values, Secondary Effects of Depletion, and Calculation of Temporal Difference Error.
    eNeuro. 2018 Feb 21;5(1). doi: 10.1523/ENEURO.0021-18.2018. eCollection 2018 Jan-Feb.

Cited by

1. Investigating Transfer Learning in Noisy Environments: A Study of Predecessor and Successor Features in Spatial Learning Using a T-Maze.
   Sensors (Basel). 2024 Oct 3;24(19):6419. doi: 10.3390/s24196419.
2. Adapting hippocampus multi-scale place field distributions in cluttered environments optimizes spatial navigation and learning.
   Front Comput Neurosci. 2022 Dec 12;16:1039822. doi: 10.3389/fncom.2022.1039822. eCollection 2022.
3. Multiplexed action-outcome representation by striatal striosome-matrix compartments detected with a mouse cost-benefit foraging task.
   Nat Commun. 2022 Mar 22;13(1):1541. doi: 10.1038/s41467-022-28983-5.
4. A systems-neuroscience model of phasic dopamine.
   Psychol Rev. 2020 Nov;127(6):972-1021. doi: 10.1037/rev0000199. Epub 2020 Jun 11.
5. A Computational Model of Dual Competition between the Basal Ganglia and the Cortex.
   eNeuro. 2019 Jan 4;5(6). doi: 10.1523/ENEURO.0339-17.2018. eCollection 2018 Nov-Dec.
6. Affective-associative two-process theory: a neurocomputational account of partial reinforcement extinction effects.
   Biol Cybern. 2017 Dec;111(5-6):365-388. doi: 10.1007/s00422-017-0730-1. Epub 2017 Sep 14.
7. Improving Robot Motor Learning with Negatively Valenced Reinforcement Signals.
   Front Neurorobot. 2017 Apr 3;11:10. doi: 10.3389/fnbot.2017.00010. eCollection 2017.
8. Functional Relevance of Different Basal Ganglia Pathways Investigated in a Spiking Model with Reward Dependent Plasticity.
   Front Neural Circuits. 2016 Jul 21;10:53. doi: 10.3389/fncir.2016.00053. eCollection 2016.
9. A Biologically Inspired Computational Model of Basal Ganglia in Action Selection.
   Comput Intell Neurosci. 2015;2015:187417. doi: 10.1155/2015/187417. Epub 2015 Nov 10.
10. Human substantia nigra neurons encode decision outcome and are modulated by categorization uncertainty in an auditory categorization task.
    Physiol Rep. 2015 Sep;3(9). doi: 10.14814/phy2.12422.