Opponent actor learning (OpAL): modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive.

Author information

Collins Anne G E, Frank Michael J

Affiliation

Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Brown University.

Publication information

Psychol Rev. 2014 Jul;121(3):337-66. doi: 10.1037/a0037015.

DOI:10.1037/a0037015

PMID:25090423

Abstract

The striatal dopaminergic system has been implicated in reinforcement learning (RL), motor performance, and incentive motivation. Various computational models have been proposed to account for each of these effects individually, but a formal analysis of their interactions is lacking. Here we present a novel algorithmic model expanding the classical actor-critic architecture to include fundamental interactive properties of neural circuit models, incorporating both incentive and learning effects into a single theoretical framework. The standard actor is replaced by a dual opponent actor system representing distinct striatal populations, which come to differentially specialize in discriminating positive and negative action values. Dopamine modulates the degree to which each actor component contributes to both learning and choice discriminations. In contrast to standard frameworks, this model simultaneously captures documented effects of dopamine on both learning and choice incentive, and their interactions, across a variety of studies, including probabilistic RL, effort-based choice, and motor skill learning.
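The architecture described in the abstract (a critic plus two opponent actors whose contributions are scaled by dopamine) can be illustrated with a minimal sketch. The abstract gives no equations, so everything specific here is an assumption made for illustration: the class name `OpALSketch`, the multiplicative actor updates, the initial values, and the `rho` parameter standing in for dopamine level are hypothetical choices, not a statement of the authors' implementation.

```python
import math
import random


class OpALSketch:
    """Illustrative opponent-actor learner for a bandit task.

    All update rules and parameter names are assumptions for illustration;
    the abstract does not specify them.
    """

    def __init__(self, n_actions, alpha_critic=0.1, alpha_g=0.1, alpha_n=0.1,
                 beta=1.0, rho=0.0, v_init=0.5):
        self.v = [v_init] * n_actions  # critic: expected reward per action
        self.g = [1.0] * n_actions     # actor tracking positive outcomes
        self.n = [1.0] * n_actions     # opponent actor tracking negative outcomes
        self.alpha_critic = alpha_critic
        self.alpha_g = alpha_g
        self.alpha_n = alpha_n
        self.beta = beta
        self.rho = rho                 # hypothetical dopamine knob in [-1, 1]

    def update(self, action, reward):
        """Apply one learning step and return the critic's prediction error."""
        delta = reward - self.v[action]
        self.v[action] += self.alpha_critic * delta
        # Assumed rule: each actor's change is scaled by its current weight,
        # and the two actors receive the prediction error with opposite sign.
        self.g[action] += self.alpha_g * self.g[action] * delta
        self.n[action] += self.alpha_n * self.n[action] * (-delta)
        return delta

    def choice_probs(self):
        """Softmax over the dopamine-weighted difference of the two actors."""
        beta_g = self.beta * (1.0 + self.rho)  # higher rho emphasizes g
        beta_n = self.beta * (1.0 - self.rho)  # lower rho emphasizes n
        act = [beta_g * g - beta_n * n for g, n in zip(self.g, self.n)]
        top = max(act)                         # subtract max for stability
        exps = [math.exp(a - top) for a in act]
        total = sum(exps)
        return [e / total for e in exps]

    def choose(self):
        """Sample an action index according to the current choice probabilities."""
        actions = range(len(self.g))
        return random.choices(actions, weights=self.choice_probs())[0]
```

In this sketch, `rho` affects only how the two actors are weighted at choice time; modeling dopamine's effect on learning would additionally require making `alpha_g` and `alpha_n` depend on it, which is left out here to keep the example short.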
