
Representation and timing in theories of the dopamine system.

Authors

Daw Nathaniel D, Courville Aaron C, Touretzky David S

Affiliation

UCL, Gatsby Computational Neuroscience Unit, London, WC1N 3AR, UK.

Publication information

Neural Comput. 2006 Jul;18(7):1637-77. doi: 10.1162/neco.2006.18.7.1637.

DOI: 10.1162/neco.2006.18.7.1637
PMID: 16764517

Abstract

Although the responses of dopamine neurons in the primate midbrain are well characterized as carrying a temporal difference (TD) error signal for reward prediction, existing theories do not offer a credible account of how the brain keeps track of past sensory events that may be relevant to predicting future reward. Empirically, these shortcomings of previous theories are particularly evident in their account of experiments in which animals were exposed to variation in the timing of events. The original theories mispredicted the results of such experiments due to their use of a representational device called a tapped delay line. Here we propose that a richer understanding of history representation and a better account of these experiments can be given by considering TD algorithms for a formal setting that incorporates two features not originally considered in theories of the dopaminergic response: partial observability (a distinction between the animal's sensory experience and the true underlying state of the world) and semi-Markov dynamics (an explicit account of variation in the intervals between events). The new theory situates the dopaminergic system in a richer functional and anatomical context, since it assumes (in accord with recent computational theories of cortex) that problems of partial observability and stimulus history are solved in sensory cortex using statistical modeling and inference and that the TD system predicts reward using the results of this inference rather than raw sensory data. It also accounts for a range of experimental data, including the experiments involving programmed temporal variability and other previously unmodeled dopaminergic response phenomena, which we suggest are related to subjective noise in animals' interval timing. Finally, it offers new experimental predictions and a rich theoretical framework for designing future experiments.
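
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of TD learning over a tapped delay line, in which each time step since stimulus onset has its own value weight. It is an illustration under stated assumptions, not the authors' model: the function `run_trial`, the trial length, the reward times, and the learning parameters are all hypothetical choices. After training with reward at a fixed delay, it probes two timing manipulations (reward omitted, reward delivered early) and prints the TD error at each step. In the early-reward probe the delay line still produces a negative error at the usual reward time, which is the kind of rigid timing behavior this representation implies.

```python
def run_trial(w, reward_step, alpha=0.0, gamma=1.0):
    """Run one trial over a tapped delay line and return the TD errors.

    w[t] is the value weight of tap t (t steps since stimulus onset).
    reward_step is the step at which reward arrives, or None to omit it.
    All names and conventions here are illustrative assumptions.
    """
    deltas = []
    for t in range(len(w)):
        r = 1.0 if t == reward_step else 0.0
        v_next = w[t + 1] if t + 1 < len(w) else 0.0
        delta = r + gamma * v_next - w[t]  # TD error at step t
        w[t] += alpha * delta              # learning is off when alpha == 0
        deltas.append(delta)
    return deltas


N_TAPS, USUAL_STEP, EARLY_STEP = 10, 5, 3  # hypothetical trial layout

w = [0.0] * N_TAPS
for _ in range(2000):                      # train with reward at the usual time
    run_trial(w, USUAL_STEP, alpha=0.1)

on_time = run_trial(w, USUAL_STEP)         # probes leave the weights unchanged
omitted = run_trial(w, None)
early = run_trial(w, EARLY_STEP)

for name, deltas in [("on time", on_time), ("omitted", omitted), ("early", early)]:
    print(name, [round(d, 2) for d in deltas])
```

The sketch starts each trial at stimulus onset and so does not model the response to the stimulus itself. It also ignores partial observability and interval variability, the two features the abstract proposes adding.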


Similar articles

1. Representation and timing in theories of the dopamine system.
   Neural Comput. 2006 Jul;18(7):1637-77. doi: 10.1162/neco.2006.18.7.1637.
2. Stimulus representation and the timing of reward-prediction errors in models of the dopamine system.
   Neural Comput. 2008 Dec;20(12):3034-54. doi: 10.1162/neco.2008.11-07-654.
3. Long-term reward prediction in TD models of the dopamine system.
   Neural Comput. 2002 Nov;14(11):2567-83. doi: 10.1162/089976602760407973.
4. The computational neurobiology of learning and reward.
   Curr Opin Neurobiol. 2006 Apr;16(2):199-204. doi: 10.1016/j.conb.2006.03.006. Epub 2006 Mar 24.
5. The short-latency dopamine signal: a role in discovering novel actions?
   Nat Rev Neurosci. 2006 Dec;7(12):967-75. doi: 10.1038/nrn2022. Epub 2006 Nov 8.
6. Reward-dependent learning in neuronal networks for planning and decision making.
   Prog Brain Res. 2000;126:217-29. doi: 10.1016/S0079-6123(00)26016-0.
7. Anticipatory responses of dopamine neurons and cortical neurons reproduced by internal model.
   Exp Brain Res. 2001 Sep;140(2):234-40. doi: 10.1007/s002210100814.
8. Temporal difference model reproduces anticipatory neural activity.
   Neural Comput. 2001 Apr;13(4):841-62. doi: 10.1162/089976601300014376.
9. Dopamine responses comply with basic assumptions of formal learning theory.
   Nature. 2001 Jul 5;412(6842):43-8. doi: 10.1038/35083500.
10. Dopamine, prediction error and associative learning: a model-based account.
    Network. 2006 Mar;17(1):61-84. doi: 10.1080/09548980500361624.

Cited by

1. Dopaminergic action prediction errors serve as a value-free teaching signal.
   Nature. 2025 May 14. doi: 10.1038/s41586-025-09008-9.
2. Prospective contingency explains behavior and dopamine signals during associative learning.
   Nat Neurosci. 2025 Mar 18. doi: 10.1038/s41593-025-01915-4.
3. Addressing Altered Anticipation as a Transdiagnostic Target Through Computational Psychiatry.
   Biol Psychiatry Cogn Neurosci Neuroimaging. 2025 Mar 7. doi: 10.1016/j.bpsc.2025.02.014.
4. The devilish details affecting TDRL models in dopamine research.
   Trends Cogn Sci. 2025 May;29(5):434-447. doi: 10.1016/j.tics.2025.02.001. Epub 2025 Feb 26.
5. Insights into the interaction between time and reward prediction on the activity of striatal tonically active neurons: A pilot study in rhesus monkeys.
   Physiol Rep. 2024 Sep;12(17):e70037. doi: 10.14814/phy2.70037.
6. Time-scale invariant contingency yields one-shot reinforcement learning despite extremely long delays to reinforcement.
   Proc Natl Acad Sci U S A. 2024 Jul 23;121(30):e2405451121. doi: 10.1073/pnas.2405451121. Epub 2024 Jul 15.
7. The role of prospective contingency in the control of behavior and dopamine signals during associative learning.
   bioRxiv. 2024 Feb 6:2024.02.05.578961. doi: 10.1101/2024.02.05.578961.
8. Expectancy-related changes in firing of dopamine neurons depend on hippocampus.
   bioRxiv. 2023 Jul 21:2023.07.19.549728. doi: 10.1101/2023.07.19.549728.
9. Emergence of belief-like representations through reinforcement learning.
   PLoS Comput Biol. 2023 Sep 11;19(9):e1011067. doi: 10.1371/journal.pcbi.1011067. eCollection 2023 Sep.
10. Dopaminergic prediction errors in the ventral tegmental area reflect a multithreaded predictive model.
    Nat Neurosci. 2023 May;26(5):830-839. doi: 10.1038/s41593-023-01310-x. Epub 2023 Apr 20.