A model of reward choice based on reinforcement learning theory

[The model of the reward choice basing on the theory of reinforcement learning].

Author information

Smirnitskaia I A, Frolov A A, Merzhanova G Kh

Publication information

Zh Vyssh Nerv Deiat Im I P Pavlova. 2007 Mar-Apr;57(2):133-43.

Abstract

We developed a model of the alimentary instrumental conditioned bar-pressing reflex in cats choosing between an immediate small reinforcement ("impulsive behavior") and a delayed, more valuable reinforcement ("self-control behavior"). The model is based on reinforcement learning theory. We emulated the contribution of dopamine with the theory's discount coefficient, which sets the subjective decrease in the value of a delayed reinforcement. In computer simulations, "cats" with a large discount coefficient demonstrated "self-control behavior", whereas a small discount coefficient was associated with "impulsive behavior". These results agree with experimental data indicating that impulsive behavior is due to a decreased amount of dopamine in the striatum.
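The link between the discount coefficient and the choice can be illustrated with a minimal sketch. It assumes standard exponential discounting, in which a reward r delivered after d time steps is worth r * gamma**d; the abstract does not state the exact form used in the model, and the reward sizes, delay, and gamma values below are hypothetical, chosen only for illustration. This is not the authors' simulation, only the value comparison that underlies it.

```python
def discounted_value(reward, delay, gamma):
    """Present value of a reward delivered after `delay` time steps,
    assuming exponential discounting with coefficient `gamma` (0 < gamma <= 1)."""
    return reward * gamma ** delay


def choose(small, large, delay, gamma):
    """Compare an immediate small reward with a delayed large reward.

    Returns "self-control" if the discounted large reward is worth more,
    otherwise "impulsive".
    """
    v_small = discounted_value(small, 0, gamma)      # immediate, so undiscounted
    v_large = discounted_value(large, delay, gamma)  # shrinks as gamma decreases
    return "self-control" if v_large > v_small else "impulsive"


# Hypothetical values, not taken from the paper.
SMALL, LARGE, DELAY = 1.0, 3.0, 5

for gamma in (0.6, 0.95):
    print(gamma, choose(SMALL, LARGE, DELAY, gamma))
```

With these assumed numbers, the agent with the small coefficient (0.6) values the delayed reward below the immediate one and chooses impulsively, while the agent with the large coefficient (0.95) waits for the larger reward, matching the qualitative pattern described in the abstract.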
