A model of reward choice based on the theory of reinforcement learning.

Author information

Smirnitskaya I A, Frolov A A, Merzhanova G Kh

Affiliation

Institute of Higher Nervous Activity and Neurophysiology, Russian Academy of Sciences, Moscow.

Publication information

Neurosci Behav Physiol. 2008 Mar;38(3):269-78. doi: 10.1007/s11055-008-0039-6.

DOI:10.1007/s11055-008-0039-6

PMID:18264774

Abstract

A model explaining behavioral "impulsivity" and "self-control" is proposed on the basis of the theory of reinforcement learning. The discount coefficient gamma, which in this theory accounts for the subjective reduction in the value of a delayed reinforcement, is identified with the overall level of dopaminergic neuron activity which, according to published data, also determines the behavioral variant. Computer modeling showed that high values of gamma are characteristic of predominantly "self-controlled" subjects, while smaller values of gamma are characteristic of "impulsive" subjects.
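
The role of gamma can be illustrated with a minimal sketch of exponential discounting, the standard form used in reinforcement learning. This is not the authors' model: the function names, the reward magnitudes, and the delays below are hypothetical values chosen only to show how the preferred option flips as gamma changes.

```python
def discounted_value(reward: float, delay: int, gamma: float) -> float:
    """Subjective value of `reward` received after `delay` time steps,
    discounted by a factor of `gamma` per step."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return reward * gamma ** delay


def choose(gamma: float, small=(1.0, 1), large=(4.0, 6)) -> str:
    """Compare a small, soon reward with a large, late one.

    Each option is a (reward, delay) pair; the defaults are
    illustrative assumptions, not values from the paper.
    """
    v_small = discounted_value(*small, gamma)
    v_large = discounted_value(*large, gamma)
    # Preferring the larger, later reward is labeled "self-controlled".
    return "self-controlled" if v_large > v_small else "impulsive"


if __name__ == "__main__":
    for gamma in (0.5, 0.9):
        print(f"gamma={gamma}: {choose(gamma)}")
```

With these assumed options, gamma = 0.9 favors the larger delayed reward and gamma = 0.5 favors the smaller immediate one, matching the qualitative pattern reported in the abstract.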
