Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland.
Nat Hum Behav. 2020 Oct;4(10):1053-1066. doi: 10.1038/s41562-020-0905-y. Epub 2020 Jul 6.
Distinct model-free and model-based learning processes are thought to drive both typical and dysfunctional behaviours. Data from two-stage decision tasks have seemingly shown that human behaviour is driven by both processes operating in parallel. However, in this study, we show that more detailed task instructions lead participants to make primarily model-based choices that have little, if any, simple model-free influence. We also demonstrate that behaviour in the two-stage task may falsely appear to be driven by a combination of simple model-free and model-based learning if purely model-based agents form inaccurate models of the task because of misconceptions. Furthermore, we report evidence that many participants do misconceive the task in important ways. Overall, we argue that humans formulate a wide variety of learning models. Consequently, the simple dichotomy of model-free versus model-based learning is inadequate to explain behaviour in the two-stage task and connections between reward learning, habit formation and compulsivity.