Interdepartmental Neuroscience Program, Yale University School of Medicine, New Haven, CT 06511.
Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06511.
eNeuro. 2022 Mar 11;9(2). doi: 10.1523/ENEURO.0457-21.2022. Print 2022 Mar-Apr.
In a competitive game involving an animal and an opponent, the outcome is contingent on the choices of both players. To succeed, the animal must continually adapt to competitive pressure, or else risk being exploited and losing out on rewards. In this study, we demonstrate that head-fixed male mice can be trained to play the iterative competitive game "matching pennies" against a virtual computer opponent. We find that the animals' performance is well described by a hybrid computational model that includes Q-learning and choice kernels. Comparing matching pennies with a non-competitive two-armed bandit task, we show that the two tasks encourage animals to operate in different regimes of reinforcement learning. To understand the involvement of neuromodulatory mechanisms, we measure fluctuations in pupil size and use multiple linear regression to relate the trial-by-trial transient pupil responses to decision-related variables. The analysis reveals that pupil responses are modulated by observable variables, including choice and outcome, as well as by latent variables for value updating, but not by latent variables for action selection. Collectively, these results establish a paradigm for studying competitive decision-making in head-fixed mice and provide insights into the role of arousal-linked neuromodulation in the decision process.
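To make the idea of a hybrid model concrete, the following is a minimal sketch of one common way to combine Q-learning with choice kernels. It is not the authors' fitted model: the function names, the parameter names (`alpha_q`, `alpha_k`, `beta_q`, `beta_k`), their default values, the reward rule (reward when the two choices match), and the purely random stand-in opponent are all assumptions made for illustration.

```python
import math
import random


def prob_action0(q, k, beta_q, beta_k):
    """Softmax probability of choosing action 0 from values q and kernels k."""
    # With two actions, the softmax reduces to a logistic of the differences.
    z = beta_q * (q[0] - q[1]) + beta_k * (k[0] - k[1])
    return 1.0 / (1.0 + math.exp(-z))


def update(q, k, choice, reward, alpha_q, alpha_k):
    """Return new (q, k) after one trial; the inputs are not modified."""
    q, k = list(q), list(k)
    # Q-learning: move the chosen action's value toward the obtained reward.
    q[choice] += alpha_q * (reward - q[choice])
    # Choice kernel: track recent choice history, independent of reward.
    for a in (0, 1):
        target = 1.0 if a == choice else 0.0
        k[a] += alpha_k * (target - k[a])
    return q, k


def simulate(n_trials=1000, alpha_q=0.3, alpha_k=0.1,
             beta_q=3.0, beta_k=1.0, seed=0):
    """Simulate the agent and return its fraction of rewarded trials."""
    rng = random.Random(seed)
    q, k = [0.0, 0.0], [0.0, 0.0]
    total = 0.0
    for _ in range(n_trials):
        p0 = prob_action0(q, k, beta_q, beta_k)
        choice = 0 if rng.random() < p0 else 1
        # Assumption: a random opponent, standing in for the computer opponent.
        opponent = rng.randrange(2)
        # Assumption: the agent is rewarded when its choice matches.
        reward = 1.0 if choice == opponent else 0.0
        q, k = update(q, k, choice, reward, alpha_q, alpha_k)
        total += reward
    return total / n_trials


if __name__ == "__main__":
    print(simulate())
```

In this sketch the two inverse temperatures set how strongly the choice depends on learned values versus on the tendency to repeat recent choices, which is one way that different tasks could place an agent in different regimes of reinforcement learning.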