Smirnitskaia I A, Frolov A A, Merzhanova G Kh
Zh Vyssh Nerv Deiat Im I P Pavlova. 2007 Mar-Apr;57(2):133-43.
We developed a model of the alimentary instrumental conditioned bar-pressing reflex in cats choosing between an immediate small reinforcement ("impulsive behavior") and a delayed, more valuable reinforcement ("self-control behavior"). The model is based on reinforcement learning theory. We emulated the contribution of dopamine with the theory's discount coefficient, which sets the subjective decrease in the value of a delayed reinforcement. In computer simulations, "cats" with a large discount coefficient demonstrated "self-control behavior", whereas a small discount coefficient was associated with "impulsive behavior". These results agree with experimental data indicating that impulsive behavior is due to a decreased amount of dopamine in the striatum.
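The role of the discount coefficient can be illustrated with a minimal sketch. It assumes the standard exponential discounting of reinforcement learning, where a reward r delivered after a delay of d steps is valued at r * gamma**d with gamma between 0 and 1. The abstract does not give the model's equations, reward sizes, or delays, so the function names and all numbers below are hypothetical and chosen only for illustration.

```python
def discounted_value(reward, delay, gamma):
    """Subjective value of a reward received after `delay` time steps,
    assuming exponential discounting (an assumption, not the paper's stated form)."""
    return reward * gamma ** delay


def preferred_option(gamma, small=1.0, small_delay=0, large=3.0, large_delay=5):
    """Compare an immediate small reward with a delayed larger one.
    Reward magnitudes and delays are hypothetical illustration values."""
    v_small = discounted_value(small, small_delay, gamma)
    v_large = discounted_value(large, large_delay, gamma)
    return "self-control" if v_large > v_small else "impulsive"


if __name__ == "__main__":
    # A larger gamma devalues the delayed reward less.
    for gamma in (0.6, 0.95):
        print(gamma, preferred_option(gamma))
```

With these illustrative numbers, the simulated agent switches from the immediate to the delayed reward once gamma**5 exceeds 1/3, that is, for gamma above roughly 0.80. This matches the direction reported in the abstract: a large discount coefficient goes with "self-control behavior" and a small one with "impulsive behavior".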