Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut 06511.
Department of Experimental Psychology, University of Oxford, Oxford OX2 6GG, United Kingdom.
J Neurosci. 2023 Jan 18;43(3):458-471. doi: 10.1523/JNEUROSCI.1113-22.2022. Epub 2022 Oct 10.
Model-free and model-based computations are argued to distinctly update the action values that guide decision-making. It is not known, however, whether the model-free and model-based reinforcement-learning mechanisms recruited in operant-based instrumental tasks parallel those engaged by Pavlovian-based behavioral procedures. Recent computational work has suggested that individual differences in the attribution of incentive salience to reward-predictive cues, that is, sign- and goal-tracking behaviors, are also governed by variation in the model-free and model-based value representations that guide behavior. Moreover, it is not known whether the systems characterized computationally by model-free and model-based algorithms are conserved across tasks within individual animals. In the current study, we used a within-subject design to assess sign-tracking and goal-tracking behaviors with a Pavlovian conditioned approach task and then characterized behavior in an instrumental multistage decision-making (MSDM) task in male rats. We hypothesized that Pavlovian and instrumental learning are driven by common reinforcement-learning mechanisms. Our data confirm that sign-tracking behavior was associated with greater reward-mediated, model-free reinforcement learning and that it was also linked to model-free reinforcement learning in the MSDM task. Computational analyses revealed that Pavlovian model-free updating was correlated with model-free reinforcement learning in the MSDM task. These data provide key insights into the computational mechanisms mediating associative learning and could have important implications for normal and abnormal states.

SIGNIFICANCE STATEMENT Model-free and model-based computations that guide instrumental decision-making may also be recruited in Pavlovian-based behavioral procedures. Here, we used a within-subject design to test the hypothesis that Pavlovian and instrumental learning are driven by common reinforcement-learning mechanisms. Sign-tracking and goal-tracking behaviors were assessed in rats with a Pavlovian conditioned approach task, and instrumental behavior was then characterized in an MSDM task. We report that sign-tracking behavior was associated with greater model-free, but not model-based, learning in the MSDM task. These data suggest that Pavlovian and instrumental behaviors may be driven by conserved reinforcement-learning mechanisms.
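The model-free/model-based distinction at the heart of the study can be illustrated with minimal update rules. The sketch below is not the authors' fitted model; it is a generic simplification, with hypothetical state and action labels, of how the two systems are commonly formalized in two-stage tasks such as the MSDM: a model-free learner caches action values via reward-prediction errors, whereas a model-based learner computes first-stage values prospectively from a learned transition model.

```python
def model_free_update(q, state, action, reward, alpha=0.1):
    """Model-free (temporal-difference) update: nudge the cached value of
    (state, action) toward the observed reward by learning rate alpha."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward - old)
    return q[(state, action)]

def model_based_value(transitions, q_stage2, action, actions=("L", "R")):
    """Model-based value of a first-stage action: expectation, over the
    learned transition probabilities, of the best second-stage value."""
    return sum(p * max(q_stage2.get((s2, a), 0.0) for a in actions)
               for s2, p in transitions[action].items())

# Hypothetical two-stage example: first-stage "L" leads to state "s2"
# with probability 0.7 and to "s3" with probability 0.3.
q = {}
model_free_update(q, "s1", "L", 1.0)          # cached value moves toward 1.0
transitions = {"L": {"s2": 0.7, "s3": 0.3}}
q2 = {("s2", "L"): 1.0, ("s3", "R"): 0.5}
v_mb = model_based_value(transitions, q2, "L")  # 0.7*1.0 + 0.3*0.5
```

Individual differences in behavior are then typically captured by a weight that mixes these two value estimates when choosing a first-stage action.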