

Similar Articles

1. Dopamine neurons learn to encode the long-term value of multiple future rewards.
Proc Natl Acad Sci U S A. 2011 Sep 13;108(37):15462-7. doi: 10.1073/pnas.1014457108. Epub 2011 Sep 6.
2. A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task.
Neuroscience. 1999;91(3):871-90. doi: 10.1016/s0306-4522(98)00697-6.
3. Midbrain dopamine neurons encode a quantitative reward prediction error signal.
Neuron. 2005 Jul 7;47(1):129-41. doi: 10.1016/j.neuron.2005.05.020.
4. Predictive reward signal of dopamine neurons.
J Neurophysiol. 1998 Jul;80(1):1-27. doi: 10.1152/jn.1998.80.1.1.
5. Coding of the long-term value of multiple future rewards in the primate striatum.
J Neurophysiol. 2013 Feb;109(4):1140-51. doi: 10.1152/jn.00289.2012. Epub 2012 Nov 21.
6. Dopamine errors drive excitatory and inhibitory components of backward conditioning in an outcome-specific manner.
Curr Biol. 2022 Jul 25;32(14):3210-3218.e3. doi: 10.1016/j.cub.2022.06.035. Epub 2022 Jun 24.
7. Dopamine neurons report an error in the temporal prediction of reward during learning.
Nat Neurosci. 1998 Aug;1(4):304-9. doi: 10.1038/1124.
8. Dopamine neurons learn relative chosen value from probabilistic rewards.
Elife. 2016 Oct 27;5:e18044. doi: 10.7554/eLife.18044.
9. The cost of obtaining rewards enhances the reward prediction error signal of midbrain dopamine neurons.
Nat Commun. 2019 Aug 15;10(1):3674. doi: 10.1038/s41467-019-11334-2.
10. Dopamine prediction error responses integrate subjective value from different reward dimensions.
Proc Natl Acad Sci U S A. 2014 Feb 11;111(6):2343-8. doi: 10.1073/pnas.1321596111. Epub 2014 Jan 22.

Cited By

1. Corticobulbar activity in healthy humans and Parkinson's disease: a study protocol for a novel biomarker of motivational arousal.
Front Psychol. 2025 Jul 17;16:1573534. doi: 10.3389/fpsyg.2025.1573534. eCollection 2025.
2. Trial-by-trial learning of successor representations in human behavior.
bioRxiv. 2025 Jun 16:2024.11.07.622528. doi: 10.1101/2024.11.07.622528.
3. Distributed representations of temporally accumulated reward prediction errors in the mouse cortex.
Sci Adv. 2025 Jan 24;11(4):eadi4782. doi: 10.1126/sciadv.adi4782. Epub 2025 Jan 22.
4. A dopamine mechanism for reward maximization.
Proc Natl Acad Sci U S A. 2024 May 14;121(20):e2316658121. doi: 10.1073/pnas.2316658121. Epub 2024 May 8.
5. The role of prospective contingency in the control of behavior and dopamine signals during associative learning.
bioRxiv. 2024 Feb 6:2024.02.05.578961. doi: 10.1101/2024.02.05.578961.
6. Predictions about reward outcomes in rhesus monkeys.
Behav Neurosci. 2024 Feb;138(1):43-58. doi: 10.1037/bne0000573. Epub 2023 Dec 7.
7. Neural correlates of episodic memory modulated by temporally delayed rewards.
PLoS One. 2021 Apr 7;16(4):e0249290. doi: 10.1371/journal.pone.0249290. eCollection 2021.
8. Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys.
Nat Commun. 2020 Jul 28;11(1):3771. doi: 10.1038/s41467-020-17343-w.
9. Topographic distinction in long-term value signals between presumed dopamine neurons and presumed striatal projection neurons in behaving monkeys.
Sci Rep. 2020 Jun 2;10(1):8912. doi: 10.1038/s41598-020-65914-0.
10. Recent advances in understanding the role of phasic dopamine activity.
F1000Res. 2019 Sep 24;8. doi: 10.12688/f1000research.19793.1. eCollection 2019.

References

1. Dopamine, time, and impulsivity in humans.
J Neurosci. 2010 Jun 30;30(26):8888-96. doi: 10.1523/JNEUROSCI.6028-09.2010.
2. Separating value from choice: delay discounting activity in the lateral intraparietal area.
J Neurosci. 2010 Apr 21;30(16):5498-507. doi: 10.1523/JNEUROSCI.5742-09.2010.
3. An "as soon as possible" effect in human intertemporal decision making: behavioral evidence and neural mechanisms.
J Neurophysiol. 2010 May;103(5):2513-31. doi: 10.1152/jn.00177.2009. Epub 2010 Feb 24.
4. Hyperbolically discounted temporal difference learning.
Neural Comput. 2010 Jun;22(6):1511-27. doi: 10.1162/neco.2010.08-09-1080.
5. Midbrain dopamine neurons signal preference for advance information about upcoming rewards.
Neuron. 2009 Jul 16;63(1):119-26. doi: 10.1016/j.neuron.2009.06.009.
6. Two types of dopamine neuron distinctly convey positive and negative motivational signals.
Nature. 2009 Jun 11;459(7248):837-41. doi: 10.1038/nature08028. Epub 2009 May 17.
7. Influence of reward delays on responses of dopamine neurons.
J Neurosci. 2008 Jul 30;28(31):7837-46. doi: 10.1523/JNEUROSCI.1600-08.2008.
8. The temporal precision of reward prediction in dopamine neurons.
Nat Neurosci. 2008 Aug;11(8):966-73. doi: 10.1038/nn.2159.
9. Prefrontal coding of temporally discounted values during intertemporal choice.
Neuron. 2008 Jul 10;59(1):161-72. doi: 10.1016/j.neuron.2008.05.010.
10. Value representations in the primate striatum during matching behavior.
Neuron. 2008 May 8;58(3):451-63. doi: 10.1016/j.neuron.2008.02.021.


Dopamine neurons learn to encode the long-term value of multiple future rewards.

Affiliation

Department of Physiology, Kyoto Prefectural University of Medicine, Kyoto 602-8566, Japan.

Publication Information

Proc Natl Acad Sci U S A. 2011 Sep 13;108(37):15462-7. doi: 10.1073/pnas.1014457108. Epub 2011 Sep 6.

DOI: 10.1073/pnas.1014457108
PMID: 21896766
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3174584/
Abstract

Midbrain dopamine neurons signal reward value, their prediction error, and the salience of events. If they play a critical role in achieving specific distant goals, long-term future rewards should also be encoded as suggested in reinforcement learning theories. Here, we address this experimentally untested issue. We recorded 185 dopamine neurons in three monkeys that performed a multistep choice task in which they explored a reward target among alternatives and then exploited that knowledge to receive one or two additional rewards by choosing the same target in a set of subsequent trials. An analysis of anticipatory licking for reward water indicated that the monkeys did not anticipate an immediately expected reward in individual trials; rather, they anticipated the sum of immediate and multiple future rewards. In accordance with this behavioral observation, the dopamine responses to the start cues and reinforcer beeps reflected the expected values of the multiple future rewards and their errors, respectively. More specifically, when monkeys learned the multistep choice task over the course of several weeks, the responses of dopamine neurons encoded the sum of the immediate and expected multiple future rewards. The dopamine responses were quantitatively predicted by theoretical descriptions of the value function with time discounting in reinforcement learning. These findings demonstrate that dopamine neurons learn to encode the long-term value of multiple future rewards with distant rewards discounted.
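The "value function with time discounting" that the abstract says quantitatively predicted the dopamine responses is, in standard reinforcement learning, the discounted sum of expected future rewards. A minimal sketch of that computation is shown below; the discount factor and reward values are illustrative assumptions, not the parameters fitted in the paper.

```python
# Sketch of the exponentially discounted value function from
# reinforcement learning: V = sum_k gamma^k * r_k.
# gamma and the reward sequence here are hypothetical example values,
# not the paper's fitted parameters.

def discounted_value(rewards, gamma=0.9):
    """Return the discounted sum of a sequence of future rewards,
    with rewards[0] the immediate reward."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

# One immediate reward plus two future rewards of equal size is worth
# more than the immediate reward alone, but less than their plain sum:
v = discounted_value([1.0, 1.0, 1.0], gamma=0.9)  # ≈ 1 + 0.9 + 0.81 = 2.71
```

Under this description, a cue predicting two additional future rewards elicits a larger response than a cue predicting only the immediate reward, with the more distant rewards contributing less, which matches the discounting pattern the study reports.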
