Dopamine-mediated learning and switching in cortico-striatal circuit explain behavioral changes in reinforcement learning.

Authors

Hong Simon, Hikosaka Okihide

Affiliation

Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, MD, USA.

Publication

Front Behav Neurosci. 2011 Mar 21;5:15. doi: 10.3389/fnbeh.2011.00015. eCollection 2011.

DOI:10.3389/fnbeh.2011.00015
PMID:21472026
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3065164/
Abstract

The basal ganglia are thought to play a crucial role in reinforcement learning. Central to the learning mechanism are dopamine (DA) D1 and D2 receptors located in the cortico-striatal synapses. However, it is still unclear how this DA-mediated synaptic plasticity is deployed and coordinated during reward-contingent behavioral changes. Here we propose a computational model of reinforcement learning that uses different thresholds of D1- and D2-mediated synaptic plasticity which are antagonized by DA-independent synaptic plasticity. A phasic increase in DA release caused by a larger-than-expected reward induces long-term potentiation (LTP) in the direct pathway, whereas a phasic decrease in DA release caused by a smaller-than-expected reward induces a cessation of long-term depression, leading to LTP in the indirect pathway. This learning mechanism can explain the robust behavioral adaptation observed in a location-reward-value-association task where the animal makes shorter latency saccades to reward locations. The changes in saccade latency become quicker as the monkey becomes more experienced. This behavior can be explained by a switching mechanism which activates the cortico-striatal circuit selectively. Our model also shows how D1- or D2-receptor blocking experiments affect selectively either reward or no-reward trials. The proposed mechanisms also explain the behavioral changes in Parkinson's disease.
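
The threshold idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the function name `weight_changes`, the class `PlasticityParams`, the normalized dopamine scale (baseline near 1.0), and every numeric value are assumptions chosen only for illustration. The sketch encodes just two statements from the abstract: D1-mediated LTP in the direct pathway engages when dopamine rises above a higher threshold, and D2-mediated LTD in the indirect pathway stops when dopamine falls below a lower threshold, with a dopamine-independent term opposing each.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlasticityParams:
    # All values are hypothetical and unitless; none come from the paper.
    d1_threshold: float = 1.2         # DA level above which D1-mediated LTP engages
    d2_threshold: float = 0.8         # DA level above which D2-mediated LTD engages
    da_dependent_rate: float = 0.10   # size of the DA-dependent weight change
    da_independent_rate: float = 0.04 # size of the opposing DA-independent change


def weight_changes(da: float, p: PlasticityParams = PlasticityParams()):
    """Return (direct, indirect) cortico-striatal weight changes for one trial.

    `da` is a normalized dopamine level; 1.0 is treated as baseline (assumption).
    """
    # Direct pathway: D1-mediated LTP, opposed by DA-independent LTD.
    d1_ltp = p.da_dependent_rate if da > p.d1_threshold else 0.0
    direct = d1_ltp - p.da_independent_rate

    # Indirect pathway: D2-mediated LTD, opposed by DA-independent LTP.
    # When DA dips below the D2 threshold the LTD stops, leaving net LTP.
    d2_ltd = p.da_dependent_rate if da > p.d2_threshold else 0.0
    indirect = p.da_independent_rate - d2_ltd

    return direct, indirect


if __name__ == "__main__":
    for label, da in [("burst", 1.5), ("baseline", 1.0), ("dip", 0.5)]:
        direct, indirect = weight_changes(da)
        print(f"{label:8s} DA={da:.1f} direct={direct:+.2f} indirect={indirect:+.2f}")
```

With these made-up parameters, a dopamine burst yields a positive change in the direct pathway and a dopamine dip yields a positive change in the indirect pathway, matching the qualitative description in the abstract. The behavior at baseline is an artifact of the toy parameters and should not be read as a claim about the published model.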


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/c7f437a73480/fnbeh-05-00015-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/4f1569a0f855/fnbeh-05-00015-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/abe13d81fabf/fnbeh-05-00015-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/633e278c2d65/fnbeh-05-00015-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/10a6fca655e6/fnbeh-05-00015-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/99a26ab4f2b8/fnbeh-05-00015-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10cd/3065164/997141970a98/fnbeh-05-00015-g007.jpg
