Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415.
Division of Neuroscience, Oregon National Primate Research Center, Beaverton, OR 97006.
J Neurosci. 2024 Jan 31;44(5):e1206232023. doi: 10.1523/JNEUROSCI.1206-23.2023.
Deciding whether to forgo immediate rewards to explore new opportunities is a key component of flexible behavior and is critical for the survival of the species. Although previous studies have shown that different cortical and subcortical areas, including the amygdala and ventral striatum (VS), are implicated in representing the immediate (exploitative) and future (explorative) value of choices, the effect of the motor system used to make choices has not been examined. Here, we tested male rhesus macaques with amygdala or VS lesions on two versions of a three-arm bandit task in which choices were registered with either a saccade or an arm movement. In both tasks, we presented the monkeys with explore-exploit tradeoffs by periodically replacing familiar options with novel options that had unknown reward probabilities. We found that monkeys explored more with saccades but showed better learning with arm movements. VS lesions caused the monkeys to be more explorative with arm movements and less explorative with saccades, although this may have been due to an overall decrease in performance. VS lesions affected the monkeys' ability to learn novel stimulus-reward associations in both tasks, while after amygdala lesions this effect was stronger when choices were made with saccades. Further, on average, VS and amygdala lesions reduced the monkeys' ability to choose better options only when choices were made with a saccade. These results show that learning reward-value associations to manage explore-exploit behaviors is motor-system dependent, and they further define the contributions of the amygdala and VS to reinforcement learning.