Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Brown University, Providence, RI, USA.
Eur J Neurosci. 2012 Apr;35(7):1024-35. doi: 10.1111/j.1460-9568.2011.07980.x.
Instrumental learning involves corticostriatal circuitry and the dopaminergic system. This system is typically modeled in the reinforcement learning (RL) framework by incrementally accumulating reward values of states and actions. However, human learning also implicates prefrontal cortical mechanisms involved in higher level cognitive functions. The interaction of these systems remains poorly understood, and models of human behavior often ignore working memory (WM) and therefore incorrectly assign behavioral variance to the RL system. Here we designed a task that highlights the profound entanglement of these two processes, even in simple learning problems. By systematically varying the size of the learning problem and delay between stimulus repetitions, we separately extracted WM-specific effects of load and delay on learning. We propose a new computational model that accounts for the dynamic integration of RL and WM processes observed in subjects' behavior. Incorporating capacity-limited WM into the model allowed us to capture behavioral variance that could not be captured in a pure RL framework even if we (implausibly) allowed separate RL systems for each set size. The WM component also allowed for a more reasonable estimation of a single RL process. Finally, we report effects of two genetic polymorphisms having relative specificity for prefrontal and basal ganglia functions. Whereas the COMT gene coding for catechol-O-methyl transferase selectively influenced model estimates of WM capacity, the GPR6 gene coding for G-protein-coupled receptor 6 influenced the RL learning rate. Thus, this study allowed us to specify distinct influences of the high-level and low-level cognitive functions on instrumental learning, beyond the possibilities offered by simple RL models.
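The abstract describes the model only in words, so the following is a minimal sketch of one way a capacity-limited WM module could be mixed with an incremental RL module. It is not the authors' implementation. The class name `RLWMAgent`, the parameter names (`alpha`, `beta`, `capacity`, `rho`, `decay`), the softmax policy, and the mixture weight `rho * min(1, capacity / n_stimuli)` are all assumptions introduced for illustration.

```python
import math
import random


def softmax(values, beta):
    """Softmax choice probabilities with inverse temperature beta."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class RLWMAgent:
    """Hypothetical mixture of slow RL and fast, capacity-limited WM."""

    def __init__(self, n_stimuli, n_actions, alpha=0.1, beta=8.0,
                 capacity=3, rho=0.9, decay=0.1):
        self.n_actions = n_actions
        self.alpha = alpha    # RL learning rate (assumed name)
        self.beta = beta      # softmax inverse temperature (assumed)
        self.decay = decay    # per-trial WM forgetting rate (assumed)
        self.init = 1.0 / n_actions
        self.q = [[self.init] * n_actions for _ in range(n_stimuli)]
        self.w = [[self.init] * n_actions for _ in range(n_stimuli)]
        # Assumption: reliance on WM shrinks once set size exceeds capacity.
        self.wm_weight = rho * min(1.0, capacity / n_stimuli)

    def policy(self, s):
        """Weighted mixture of the WM and RL softmax policies."""
        p_rl = softmax(self.q[s], self.beta)
        p_wm = softmax(self.w[s], self.beta)
        k = self.wm_weight
        return [k * pw + (1.0 - k) * pr for pw, pr in zip(p_wm, p_rl)]

    def update(self, s, a, reward):
        # WM traces for all stimuli decay toward their initial value,
        # so longer delays between repetitions weaken WM (delay effect).
        for row in self.w:
            for i in range(self.n_actions):
                row[i] += self.decay * (self.init - row[i])
        # RL: incremental delta-rule update of the chosen action's value.
        self.q[s][a] += self.alpha * (reward - self.q[s][a])
        # WM: one-shot storage of the most recent outcome.
        self.w[s][a] = reward


def simulate(n_stimuli, n_trials=200, n_actions=3, seed=0, **params):
    """Run one block with a fixed correct action per stimulus; return accuracy."""
    rng = random.Random(seed)
    agent = RLWMAgent(n_stimuli, n_actions, **params)
    n_correct = 0
    for _ in range(n_trials):
        s = rng.randrange(n_stimuli)
        a = rng.choices(range(n_actions), weights=agent.policy(s))[0]
        reward = 1.0 if a == s % n_actions else 0.0
        agent.update(s, a, reward)
        n_correct += int(reward)
    return n_correct / n_trials
```

Calling `simulate(2)` and `simulate(6)` with the same parameters gives a rough way to see how set size changes the relative contribution of the two modules in this sketch. Fitting `capacity` and `alpha` to behavioral data would need a likelihood function, which is not shown here.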