Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior.

Affiliation

Physical and Health Education, Graduate School of Education, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan.

Publication information

J Neurosci. 2013 May 15;33(20):8866-90. doi: 10.1523/JNEUROSCI.4614-12.2013.

DOI:10.1523/JNEUROSCI.4614-12.2013

PMID:23678129

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6618820/

Abstract

Humans and animals take actions quickly when they expect that the actions lead to reward, reflecting their motivation. Injection of dopamine receptor antagonists into the striatum has been shown to slow such reward-seeking behavior, suggesting that dopamine is involved in the control of motivational processes. Meanwhile, neurophysiological studies have revealed that phasic response of dopamine neurons appears to represent reward prediction error, indicating that dopamine plays central roles in reinforcement learning. However, previous attempts to elucidate the mechanisms of these dopaminergic controls have not fully explained how the motivational and learning aspects are related and whether they can be understood by the way the activity of dopamine neurons itself is controlled by their upstream circuitries. To address this issue, we constructed a closed-circuit model of the corticobasal ganglia system based on recent findings regarding intracortical and corticostriatal circuit architectures. Simulations show that the model could reproduce the observed distinct motivational effects of D1- and D2-type dopamine receptor antagonists. Simultaneously, our model successfully explains the dopaminergic representation of reward prediction error as observed in behaving animals during learning tasks and could also explain distinct choice biases induced by optogenetic stimulation of the D1 and D2 receptor-expressing striatal neurons. These results indicate that the suggested roles of dopamine in motivational control and reinforcement learning can be understood in a unified manner through a notion that the indirect pathway of the basal ganglia represents the value of states/actions at a previous time point, an empirically driven key assumption of our model.
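
The reward prediction error mentioned in the abstract is commonly formalized as a temporal-difference (TD) error, which compares the value held at the previous time point with the reward plus the discounted current value. The sketch below is a generic TD(0) illustration of that idea, not the authors' circuit model. The function names, the chain-of-states task, and the parameters `alpha` and `gamma` are assumptions made here for illustration. Labeling the current-value term as "direct-pathway-like" is likewise an assumption; the abstract only states that the indirect pathway represents the previous value.

```python
def td_error(reward, v_current, v_previous, gamma=0.9):
    """Temporal-difference reward prediction error.

    v_previous: value at the previous time point (the quantity the abstract
                attributes to the indirect pathway).
    v_current:  value at the current time point (labeled "direct-pathway-like"
                here purely as an illustrative assumption).
    """
    return reward + gamma * v_current - v_previous


def run_trials(n_states=5, n_trials=200, alpha=0.1, gamma=0.9, reward=1.0):
    """Learn state values on a hypothetical chain task with reward at the end.

    Returns the learned values plus the per-step errors of the first and
    last trial, so the change in prediction error over learning is visible.
    """
    values = [0.0] * n_states
    first_errors, last_errors = [], []
    for trial in range(n_trials):
        errors = []
        for t in range(1, n_states + 1):
            v_prev = values[t - 1]
            # After the final state the episode ends, so the next value is 0.
            v_cur = values[t] if t < n_states else 0.0
            r = reward if t == n_states else 0.0
            delta = td_error(r, v_cur, v_prev, gamma)
            # Update the value of the previous state in proportion to the error.
            values[t - 1] += alpha * delta
            errors.append(delta)
        if trial == 0:
            first_errors = errors
        last_errors = errors
    return values, first_errors, last_errors


if __name__ == "__main__":
    values, first_errors, last_errors = run_trials()
    print("learned values:", [round(v, 3) for v in values])
    print("errors, first trial:", [round(e, 3) for e in first_errors])
    print("errors, last trial:", [round(e, 3) for e in last_errors])
```

In this toy setting the error is large at reward delivery on the first trial and shrinks toward zero once the reward is fully predicted, which is the qualitative signature usually described for a reward prediction error signal.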
