Model-based and model-free Pavlovian reward learning: revaluation, revision, and revelation.

Author information

Dayan Peter, Berridge Kent C

Affiliation

Gatsby Computational Neuroscience Unit, University College London, London, UK.

Publication information

Cogn Affect Behav Neurosci. 2014 Jun;14(2):473-92. doi: 10.3758/s13415-014-0277-8.

DOI: 10.3758/s13415-014-0277-8

PMID: 24647659

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4074442/

Abstract

Evidence supports at least two methods for learning about reward and punishment and making predictions for guiding actions. One method, called model-free, progressively acquires cached estimates of the long-run values of circumstances and actions from retrospective experience. The other method, called model-based, uses representations of the environment, expectations, and prospective calculations to make cognitive predictions of future value. Extensive attention has been paid to both methods in computational analyses of instrumental learning. By contrast, although a full computational analysis has been lacking, Pavlovian learning and prediction has typically been presumed to be solely model-free. Here, we revise that presumption and review compelling evidence from Pavlovian revaluation experiments showing that Pavlovian predictions can involve their own form of model-based evaluation. In model-based Pavlovian evaluation, prevailing states of the body and brain influence value computations, and thereby produce powerful incentive motivations that can sometimes be quite new. We consider the consequences of this revised Pavlovian view for the computational landscape of prediction, response, and choice. We also revisit differences between Pavlovian and instrumental learning in the control of incentive motivation.
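
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the cue name `tone`, the outcome `food`, the body states `hungry` and `sated`, the utility table, and the learning rate are all hypothetical choices made for illustration. The model-free learner caches a value from past experience with a simple delta-rule update, while the model-based evaluator recomputes value from a cue-to-outcome mapping and the current body state.

```python
def learn_cached_value(rewards, alpha=0.1):
    """Model-free: incrementally cache a value estimate from experienced rewards."""
    value = 0.0
    for reward in rewards:
        value += alpha * (reward - value)  # delta-rule (prediction-error) update
    return value


def utility(outcome, body_state):
    """Hypothetical state-dependent worth of an outcome (assumed numbers)."""
    table = {("food", "hungry"): 1.0, ("food", "sated"): 0.0}
    return table[(outcome, body_state)]


def model_based_value(cue, transition, body_state):
    """Model-based: expected utility of predicted outcomes under the current state."""
    return sum(
        prob * utility(outcome, body_state)
        for outcome, prob in transition[cue].items()
    )


# Assumed world model: the tone is always followed by food.
transition = {"tone": {"food": 1.0}}

# Training happens while hungry, so every trial delivers utility 1.0.
training_rewards = [utility("food", "hungry")] * 200
cached = learn_cached_value(training_rewards)

# Revaluation: the body state changes without any new cue-outcome experience.
mb_hungry = model_based_value("tone", transition, "hungry")
mb_sated = model_based_value("tone", transition, "sated")

print(f"cached (model-free) value: {cached:.3f}")
print(f"model-based value, hungry: {mb_hungry:.3f}")
print(f"model-based value, sated:  {mb_sated:.3f}")
```

In this sketch the cached value stays near its trained level after the state change, because it can only be revised through further experience, whereas the model-based value follows the new body state immediately. That difference is the kind of signature that revaluation experiments are designed to detect.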

