Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, VA 20147.
Proc Natl Acad Sci U S A. 2013 Oct 15;110(42):17154-9. doi: 10.1073/pnas.1310666110. Epub 2013 Sep 30.
Animals learn both whether and when a reward will occur. Neural models of timing posit that animals learn the mean time until reward, perturbed by a fixed relative uncertainty. Nonetheless, animals can learn to perform actions for reward even in highly variable natural environments. Optimal inference in the presence of variable information requires probabilistic models, yet it is unclear whether animals can infer such models for reward timing. Here, we developed a behavioral paradigm in which optimal performance required knowledge of the distribution from which reward delays were chosen. We found that mice were able to accurately adjust their behavior to the SD of the reward delay distribution. Importantly, mice were able to flexibly adjust the amount of prior information used for inference according to the moment-by-moment demands of the task. The ability to infer probabilistic models for timing may allow mice to adapt to complex and dynamic natural environments.