Campbell Ethan M, Zhong Wanting, Hogeveen Jeremy, Grafman Jordan
Department of Psychology, University of New Mexico, Albuquerque, New Mexico 87131
Clinical Neuroscience Center, University of New Mexico, Albuquerque, New Mexico 87131.
J Neurosci. 2025 Apr 9;45(15):e0422242025. doi: 10.1523/JNEUROSCI.0422-24.2025.
Probabilistic reinforcement learning (RL) tasks assay how individuals make decisions under uncertainty. The use of internal models (model-based) or direct learning from experiences (model-free), and the degree of choice stochasticity across alternatives (i.e., exploration), can all be influenced by the state space of the decision-making task. There is considerable individual variation in the balance between model-based and model-free control during decision-making, and this balance is affected by incentive motivation. The effect of variable reward incentives on the arbitration between model-based and model-free learning remains understudied, and individual differences in neural signatures and cognitive traits that moderate the effect of reward on model-free/model-based control are unknown. Here we combined a two-stage decision-making task utilizing differing reward incentives with computational modeling, neuropsychological tests, and neuroimaging to address these questions. Results showed that the prospect of greater reward decreased exploration of alternative options and shifted the balance toward model-based learning. These behavioral effects were replicated across two independent datasets including both sexes. Individual differences in processing speed and analytical thinking style affected how reward altered the dependence on both systems. Using a systems neuroscience-inspired approach to resting-state functional connectivity, we found that reduced exploration of the options during the first stage of our task under high relative to low incentives was predicted by increased cross-network coupling between ventral and dorsal RL circuitry. These findings suggest that the integrity of functional connections between stimulus valuation (ventral) and action valuation (dorsal) RL networks is associated with changes in the explore-exploit balance under changing reward incentives.
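The abstract does not specify the computational model, so the following is only a minimal sketch of one formulation commonly used for two-stage tasks, in which first-stage values are a weighted mix of model-based and model-free estimates and choices follow a softmax rule. The weight `w`, the inverse temperature `beta`, the helper names, and all numeric values are illustrative assumptions, not parameters or data from the study. Under these assumptions, a higher `w` corresponds to a shift toward model-based control, and a higher `beta` corresponds to less exploration.

```python
import math


def softmax(values, beta):
    """Choice probabilities; a larger beta gives less stochastic (less exploratory) choice."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def model_based_values(transition_probs, stage2_values):
    """First-stage values computed from a transition model and the best second-stage option."""
    return [
        sum(p * max(q) for p, q in zip(row, stage2_values))
        for row in transition_probs
    ]


def hybrid_values(q_mb, q_mf, w):
    """Weighted mix of values: w = 1 is purely model-based, w = 0 purely model-free."""
    return [w * mb + (1 - w) * mf for mb, mf in zip(q_mb, q_mf)]


# Hypothetical task structure and learned values (assumptions for illustration only).
transition_probs = [[0.7, 0.3], [0.3, 0.7]]  # P(second-stage state | first-stage action)
stage2_values = [[0.8, 0.2], [0.4, 0.5]]     # value of each option in each second-stage state
q_mf = [0.5, 0.5]                            # cached model-free first-stage values

q_mb = model_based_values(transition_probs, stage2_values)

# Hypothetical "low incentive" and "high incentive" parameter settings.
low = softmax(hybrid_values(q_mb, q_mf, w=0.3), beta=2.0)
high = softmax(hybrid_values(q_mb, q_mf, w=0.7), beta=6.0)

print("low incentive choice probabilities: ", low)
print("high incentive choice probabilities:", high)
```

With these made-up settings, the "high incentive" parameters concentrate choice probability on the first-stage action that the transition model favors, which is the qualitative pattern the abstract describes. The actual model structure and fitted parameters would need to be taken from the paper itself.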