Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.
Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA.
Cogn Affect Behav Neurosci. 2023 Jun;23(3):600-619. doi: 10.3758/s13415-022-01059-z. Epub 2023 Feb 23.
Despite being unpredictable and uncertain, reward environments often exhibit certain regularities, and animals navigating these environments try to detect and utilize such regularities to adapt their behavior. However, successful learning requires that animals also adjust to uncertainty associated with those regularities. Here, we analyzed choice data from two comparable dynamic foraging tasks in mice and monkeys to investigate mechanisms underlying adjustments to different types of uncertainty. In these tasks, animals selected between two choice options that delivered reward probabilistically, while baseline reward probabilities changed after a variable number (block) of trials without any cues to the animals. To measure adjustments in behavior, we applied multiple metrics based on information theory that quantify consistency in behavior, and fit choice data using reinforcement learning models. We found that in both species, learning and choice were affected by uncertainty about reward outcomes (in terms of determining the better option) and by expectation about when the environment may change. However, these effects were mediated through different mechanisms. First, more uncertainty about the better option resulted in slower learning and forgetting in mice, whereas it had no significant effect in monkeys. Second, expectation of block switches accompanied slower learning, faster forgetting, and increased stochasticity in choice in mice, whereas it only reduced learning rates in monkeys. Overall, while demonstrating the usefulness of metrics based on information theory in examining adaptive behavior, our study provides evidence for multiple types of adjustments in learning and choice behavior according to uncertainty in the reward environment.
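As a rough illustration of the kind of task and model described above, the following sketch simulates a two-option task whose reward probabilities swap after blocks of variable length, with a simple reinforcement learning agent that has a learning rate, a forgetting (decay) rate for the unchosen option, and a softmax inverse temperature controlling choice stochasticity. All function names, parameter values, reward probabilities, and block lengths are hypothetical and are not taken from the study. The plain Shannon entropy of choices is used only as a simple stand-in for the information-theoretic metrics mentioned in the abstract.

```python
import math
import random


def simulate(n_trials=2000, alpha=0.3, decay=0.1, beta=5.0,
             probs=(0.4, 0.1), block_range=(40, 80), seed=0):
    """Simulate a two-option dynamic foraging task (all values are assumptions).

    alpha: learning rate for the chosen option
    decay: forgetting rate applied to the unchosen option
    beta:  inverse temperature (lower = more stochastic choice)
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]                      # value estimates for the two options
    p = list(probs)                     # current reward probabilities
    next_switch = rng.randint(*block_range)
    choices, better = [], []
    for t in range(n_trials):
        if t == next_switch:            # uncued block switch: swap probabilities
            p.reverse()
            next_switch = t + rng.randint(*block_range)
        # Softmax over two options reduces to a logistic function of the value difference
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        c = 1 if rng.random() < p_choose_1 else 0
        r = 1.0 if rng.random() < p[c] else 0.0
        q[c] += alpha * (r - q[c])      # learn from the outcome of the chosen option
        q[1 - c] *= (1.0 - decay)       # forget the unchosen option's value
        choices.append(c)
        better.append(0 if p[0] > p[1] else 1)
    return choices, better


def entropy_bits(seq):
    """Shannon entropy (in bits) of the empirical distribution of seq."""
    n = len(seq)
    h = 0.0
    for v in set(seq):
        f = seq.count(v) / n
        h -= f * math.log2(f)
    return h


if __name__ == "__main__":
    choices, better = simulate()
    frac_better = sum(c == b for c, b in zip(choices, better)) / len(choices)
    print("fraction of choices to the better option:", round(frac_better, 3))
    print("choice entropy (bits):", round(entropy_bits(choices), 3))
```

Varying `alpha`, `decay`, and `beta` in this sketch gives an informal sense of how slower learning, faster forgetting, or more stochastic choice would change behavior around block switches.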