Behav Neurosci. 2022 Oct;136(5):383-391. doi: 10.1037/bne0000516. Epub 2022 Apr 28.
Animals routinely learn to associate environmental stimuli and self-generated actions with their outcomes, such as rewards. One of the most popular theoretical models of such learning is the reinforcement learning (RL) framework. The simplest form of RL, model-free RL, is widely applied to explain animal behavior in numerous neuroscientific studies. More complex versions of RL assume that animals build and store an explicit model of the world in memory. To apply these approaches to explain animal behavior, typical neuroscientific RL models make implicit assumptions about how real animals represent the passage of time. In this perspective, I explicitly list these assumptions and show that they have several problematic implications. I hope that this explicit discussion encourages the field to seriously examine the assumptions underlying timing and reinforcement learning. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
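To make "model-free RL" concrete for readers outside the modeling literature, here is a minimal sketch of the idea: an agent that learns action values purely from sampled rewards, with no explicit world model. This is a generic tabular example on a hypothetical two-armed bandit, not a reconstruction of any specific model discussed in the paper; the parameter values (learning rate, exploration rate, reward noise) are illustrative assumptions only.

```python
import random

def q_learning_bandit(mean_rewards, alpha=0.1, epsilon=0.1, steps=1000, seed=0):
    """Model-free value learning on a two-armed bandit (illustrative sketch).

    The agent stores only one number per action (its estimated value) and
    updates it from experienced rewards -- no model of the environment.
    """
    rng = random.Random(seed)
    q = [0.0] * len(mean_rewards)  # estimated value of each action
    for _ in range(steps):
        # epsilon-greedy choice: mostly exploit, occasionally explore
        if rng.random() < epsilon:
            a = rng.randrange(len(q))
        else:
            a = max(range(len(q)), key=lambda i: q[i])
        # sample a noisy reward for the chosen action (hypothetical task)
        r = mean_rewards[a] + rng.gauss(0.0, 0.1)
        # model-free prediction-error update: nudge Q toward the sample
        q[a] += alpha * (r - q[a])
    return q

q = q_learning_bandit([1.0, 0.2])
print(q[0] > q[1])  # the agent learns that the first action pays more
```

Note that nothing in this update rule represents the passage of time between events; that omission is precisely the kind of implicit assumption the abstract flags.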