A predictive reinforcement model of dopamine neurons for learning approach behavior.

Author information

Contreras-Vidal J L, Schultz W

Affiliation

Motor Control Laboratory, Arizona State University, Tempe 85287-0404, USA.

Publication information

J Comput Neurosci. 1999 May-Jun;6(3):191-214. doi: 10.1023/a:1008862904946.

DOI: 10.1023/a:1008862904946

PMID: 10406133

Abstract

A neural network model of how dopamine and prefrontal cortex activity guides short- and long-term information processing within the cortico-striatal circuits during reward-related learning of approach behavior is proposed. The model predicts two types of reward-related neuronal responses generated during learning: (1) cell activity signaling errors in the prediction of the expected time of reward delivery and (2) neural activations coding for errors in the prediction of the amount and type of reward or stimulus expectancies. The former type of signal is consistent with the responses of dopaminergic neurons, while the latter signal is consistent with reward expectancy responses reported in the prefrontal cortex. It is shown that a neural network architecture that satisfies the design principles of the adaptive resonance theory of Carpenter and Grossberg (1987) can account for the dopamine responses to novelty, generalization, and discrimination of appetitive and aversive stimuli. These hypotheses are scrutinized via simulations of the model in relation to the delivery of free food outside a task, the timed contingent delivery of appetitive and aversive stimuli, and an asymmetric, instructed delay response task.

Similar articles

A predictive reinforcement model of dopamine neurons for learning approach behavior.

J Comput Neurosci. 1999 May-Jun;6(3):191-214. doi: 10.1023/a:1008862904946.

Involvement of basal ganglia and orbitofrontal cortex in goal-directed behavior.

Prog Brain Res. 2000;126:193-215. doi: 10.1016/S0079-6123(00)26015-9.

A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task.

Neuroscience. 1999;91(3):871-90. doi: 10.1016/s0306-4522(98)00697-6.

Reward-dependent learning in neuronal networks for planning and decision making.

Prog Brain Res. 2000;126:217-29. doi: 10.1016/S0079-6123(00)26016-0.

Predictive reward signal of dopamine neurons.

J Neurophysiol. 1998 Jul;80(1):1-27. doi: 10.1152/jn.1998.80.1.1.

Modeling functions of striatal dopamine modulation in learning and planning.

Neuroscience. 2001;103(1):65-85. doi: 10.1016/s0306-4522(00)00554-6.

Computing reward-prediction error: an integrated account of cortical timing and basal-ganglia pathways for appetitive and aversive learning.

Eur J Neurosci. 2015 Aug;42(4):2003-21. doi: 10.1111/ejn.12994. Epub 2015 Jul 25.

Midbrain dopaminergic neurons and striatal cholinergic interneurons encode the difference between reward and aversive events at different epochs of probabilistic classical conditioning trials.

J Neurosci. 2008 Nov 5;28(45):11673-84. doi: 10.1523/JNEUROSCI.3839-08.2008.

Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli.

Nature. 1996 Feb 1;379(6564):449-51. doi: 10.1038/379449a0.

Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task.

J Neurosci. 1993 Mar;13(3):900-13. doi: 10.1523/JNEUROSCI.13-03-00900.1993.

Cited by

Dynamics of striatal action selection and reinforcement learning.

Elife. 2025 May 8;13:RP101747. doi: 10.7554/eLife.101747.

Human Substantia Nigra Neurons Encode Reward Expectations.

bioRxiv. 2024 May 11:2024.05.10.593406. doi: 10.1101/2024.05.10.593406.

Dynamics of striatal action selection and reinforcement learning.

bioRxiv. 2024 Dec 24:2024.02.14.580408. doi: 10.1101/2024.02.14.580408.

Blunted Expected Reward Value Signals in Binge Alcohol Drinkers.

J Neurosci. 2023 Aug 2;43(31):5685-5692. doi: 10.1523/JNEUROSCI.2157-21.2022. Epub 2023 Jan 30.

Modulation of Dopamine for Adaptive Learning: A Neurocomputational Model.

Comput Brain Behav. 2021 Mar;4(1):34-52. doi: 10.1007/s42113-020-00083-x. Epub 2020 Jun 12.

A systems-neuroscience model of phasic dopamine.

Psychol Rev. 2020 Nov;127(6):972-1021. doi: 10.1037/rev0000199. Epub 2020 Jun 11.

Therapeutic Challenges of Post-traumatic Stress Disorder: Focus on the Dopaminergic System.

Front Pharmacol. 2019 Apr 17;10:404. doi: 10.3389/fphar.2019.00404. eCollection 2019.

Synchronicity: The Role of Midbrain Dopamine in Whole-Brain Coordination.

eNeuro. 2019 May 3;6(2). doi: 10.1523/ENEURO.0345-18.2019. Print 2019 Mar/Apr.

Predominant Striatal Input to the Lateral Habenula in Macaques Comes from Striosomes.

Curr Biol. 2019 Jan 7;29(1):51-61.e5. doi: 10.1016/j.cub.2018.11.008. Epub 2018 Dec 13.

Striatopallidal neurons control avoidance behavior in exploratory tasks.

Mol Psychiatry. 2020 Feb;25(2):491-505. doi: 10.1038/s41380-018-0051-3. Epub 2018 Apr 25.

References

ART 2: self-organization of stable category recognition codes for analog input patterns.

Appl Opt. 1987 Dec 1;26(23):4919-30. doi: 10.1364/AO.26.004919.

Topographic organization of the ventral striatal efferent projections in the rhesus monkey: an anterograde tracing study.

J Comp Neurol. 1990 Mar 8;293(2):282-98. doi: 10.1002/cne.902930210.

A neural network model of adaptively timed reinforcement learning and hippocampal dynamics.

Brain Res Cogn Brain Res. 1992 Jun;1(1):3-38. doi: 10.1016/0926-6410(92)90003-a.

A quantitative description of membrane current and its application to conduction and excitation in nerve.

J Physiol. 1952 Aug;117(4):500-44. doi: 10.1113/jphysiol.1952.sp004764.

Distributed Learning, Recognition, and Prediction by ART and ARTMAP Neural Networks.

Neural Netw. 1997 Nov;10(8):1473-1494. doi: 10.1016/s0893-6080(97)00004-x.

Stimulation of the prefrontal cortex in the rat induces patterns of activity in midbrain dopaminergic neurons which resemble natural burst events.

Synapse. 1996 Mar;22(3):195-208. doi: 10.1002/(SICI)1098-2396(199603)22:3<195::AID-SYN1>3.0.CO;2-7.

A neural substrate of prediction and reward.

Science. 1997 Mar 14;275(5306):1593-9. doi: 10.1126/science.275.5306.1593.

Numerical bifurcation analysis of distance-dependent on-center off-surround shunting neural networks.

Biol Cybern. 1996 Dec;75(6):495-507. doi: 10.1007/s004220050314.

The representation of temporal information in perception and motor control.

Curr Opin Neurobiol. 1996 Dec;6(6):851-7. doi: 10.1016/s0959-4388(96)80037-7.

Functional organization of thalamocortical relays.

J Neurophysiol. 1996 Sep;76(3):1367-95. doi: 10.1152/jn.1996.76.3.1367.
