Department of Psychology, New York University, New York, New York 10003, USA.
Center for Neural Science, New York University, New York, New York 10003, USA.
Learn Mem. 2024 Mar 25;31(3). doi: 10.1101/lm.053901.123. Print 2024 Mar.
From early in life, we encounter both controllable environments, in which our actions can causally influence the reward outcomes we experience, and uncontrollable environments, in which they cannot. Environmental controllability is theoretically proposed to organize our behavior. In controllable contexts, we can learn to proactively select instrumental actions that bring about desired outcomes. In uncontrollable environments, Pavlovian learning enables hard-wired, reflexive reactions to anticipated, motivationally salient events, providing "default" behavioral responses. Previous studies characterizing the balance between Pavlovian and instrumental learning systems across development have yielded divergent findings, with some studies observing heightened expression of Pavlovian learning during adolescence and others observing a reduced influence of Pavlovian learning during this developmental stage. In this study, we aimed to investigate whether a theoretical model of controllability-dependent arbitration between learning systems might explain these seemingly divergent findings in the developmental literature, with the specific hypothesis that adolescents' action selection might be particularly sensitive to environmental controllability. To test this hypothesis, 90 participants, aged 8 to 27 years, performed a probabilistic-learning task that enables estimation of Pavlovian influence on instrumental learning, across both controllable and uncontrollable conditions. We fit participants' data with a reinforcement-learning model in which controllability inferences adaptively modulate the dominance of Pavlovian versus instrumental control. Relative to children and adults, adolescents exhibited greater flexibility in calibrating the expression of Pavlovian bias to the degree of environmental controllability. These findings suggest that sensitivity to environmental reward statistics that organize motivated behavior may be heightened during adolescence.
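The abstract describes a reinforcement-learning model in which inferred controllability modulates the balance between Pavlovian and instrumental control, but it does not give the model's equations. The following is a minimal illustrative sketch of that general idea, not the authors' fitted model. Everything in it is an assumption made for illustration: the two-action go/no-go framing, the linear arbitration rule `pavlovian_weight`, the delta-rule updates, and all function names and parameter values.

```python
import math


def update(value, target, learning_rate=0.1):
    """Delta-rule update (assumed): move an estimate toward a target."""
    return value + learning_rate * (target - value)


def pavlovian_weight(controllability):
    """Hypothetical arbitration rule: the lower the inferred controllability
    (0 = uncontrollable, 1 = fully controllable), the larger the Pavlovian
    weight. The published model may use a different, e.g. Bayesian, rule."""
    return 1.0 - controllability


def action_weights(q_go, q_nogo, stimulus_value, controllability):
    """Mix instrumental action values with a Pavlovian bias.

    The Pavlovian term adds the stimulus value to the 'go' action only, so
    reward-predicting cues promote action and punishment-predicting cues
    suppress it.
    """
    omega = pavlovian_weight(controllability)
    w_go = (1.0 - omega) * q_go + omega * stimulus_value
    w_nogo = (1.0 - omega) * q_nogo
    return w_go, w_nogo


def go_probability(w_go, w_nogo, inverse_temperature=5.0):
    """Two-action softmax: probability of choosing 'go'."""
    return 1.0 / (1.0 + math.exp(-inverse_temperature * (w_go - w_nogo)))


def update_controllability(controllability, instrumental_error,
                           pavlovian_error, learning_rate=0.1):
    """Assumed heuristic: nudge the controllability estimate toward 1 when
    the action-based prediction was more accurate than the cue-based one,
    and toward 0 otherwise."""
    evidence = 1.0 if abs(instrumental_error) < abs(pavlovian_error) else 0.0
    return update(controllability, evidence, learning_rate)


if __name__ == "__main__":
    # Same cue and (neutral) action values, different inferred controllability.
    for c in (0.9, 0.1):
        w_go, w_nogo = action_weights(0.0, 0.0, stimulus_value=0.8,
                                      controllability=c)
        print(f"controllability={c}: P(go)={go_probability(w_go, w_nogo):.3f}")
```

In this sketch, a reward-predicting cue pushes choices toward "go" more strongly when the environment is inferred to be uncontrollable. Under these assumptions, the flexibility reported for adolescents would correspond to a Pavlovian weight that tracks the controllability estimate more closely.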