Computational evidence for hierarchically structured reinforcement learning in humans.

Author Affiliations

Department of Psychology, University of California, Berkeley, CA 94704.

Publication Information

Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29381-29389. doi: 10.1073/pnas.1912330117.

Abstract

Humans have the fascinating ability to achieve goals in a complex and constantly changing world, still surpassing modern machine-learning algorithms in flexibility and learning speed. It is generally accepted that a crucial factor in this ability is the use of abstract, hierarchical representations, which exploit structure in the environment to guide learning and decision making. Nevertheless, how we create and use these hierarchical representations is poorly understood. This study presents evidence that human behavior can be characterized as hierarchical reinforcement learning (RL). We designed an experiment to test specific predictions of hierarchical RL using a series of subtasks in the realm of context-based learning, and we observed several behavioral markers of hierarchical RL: asymmetric switch costs between changes in higher-level versus lower-level features, faster learning in higher-valued than in lower-valued contexts, and a preference for higher-valued over lower-valued contexts. We replicated these results across three independent samples. We simulated three models (a classic flat RL model, a hierarchical RL model, and a hierarchical Bayesian model) and compared their behavior to the human results. While the flat RL model captured some aspects of participants' sensitivity to outcome values, and the hierarchical Bayesian model captured some markers of transfer, only hierarchical RL accounted for all patterns observed in human behavior. This work shows that hierarchical RL, a biologically inspired and computationally simple algorithm, can capture human behavior in complex, hierarchical environments, and it opens avenues for future research in this field.
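To make the algorithmic contrast concrete, the sketch below illustrates the kind of two-level scheme the abstract describes: a high-level policy learns values over abstract contexts (task sets), a low-level policy learns stimulus-action values within the chosen task set, and a single reward signal updates both levels. This is a minimal Python illustration of the algorithm class, not the authors' fitted model; the class and function names, the learning rate ALPHA, and the softmax temperature BETA are assumptions made for the example.

import math
import random
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, for illustration only)
BETA = 5.0   # softmax inverse temperature (assumed value)

def softmax_choice(q_values):
    # Sample an index with probability proportional to exp(BETA * Q).
    weights = [math.exp(BETA * q) for q in q_values]
    threshold = random.random() * sum(weights)
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if cumulative >= threshold:
            return i
    return len(weights) - 1

class HierarchicalQLearner:
    # Two-level learner: the high level selects an abstract task set
    # given the context; the low level selects a concrete action given
    # the stimulus, conditioned on the chosen task set.
    def __init__(self, n_task_sets, n_actions):
        self.q_high = defaultdict(lambda: [0.0] * n_task_sets)  # context -> task-set values
        self.q_low = defaultdict(lambda: [0.0] * n_actions)     # (task set, stimulus) -> action values

    def act(self, context, stimulus):
        task_set = softmax_choice(self.q_high[context])
        action = softmax_choice(self.q_low[(task_set, stimulus)])
        return task_set, action

    def update(self, context, stimulus, task_set, action, reward):
        # One reward drives prediction-error updates at both levels, so
        # contexts acquire values of their own. This is what yields the
        # behavioral markers above: higher-valued contexts are learned
        # faster and preferred, and changing the high-level feature is
        # costlier than changing a low-level one, because it invalidates
        # choices at both levels rather than only at the bottom.
        q_h = self.q_high[context]
        q_h[task_set] += ALPHA * (reward - q_h[task_set])
        q_l = self.q_low[(task_set, stimulus)]
        q_l[action] += ALPHA * (reward - q_l[action])

A flat RL agent would instead keep one value table keyed on the full (context, stimulus) conjunction; it can learn the same contingencies, but because contexts never acquire values of their own, it predicts neither the context preferences nor the asymmetric switch costs.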

Similar Articles

Novelty and Inductive Generalization in Human Reinforcement Learning.
Top Cogn Sci. 2015 Jul;7(3):391-415. doi: 10.1111/tops.12138. Epub 2015 Mar 23.

Reinforcement learning and human behavior.
Curr Opin Neurobiol. 2014 Apr;25:93-8. doi: 10.1016/j.conb.2013.12.004. Epub 2014 Jan 1.

Learning and forgetting using reinforced Bayesian change detection.
PLoS Comput Biol. 2019 Apr 17;15(4):e1006713. doi: 10.1371/journal.pcbi.1006713. eCollection 2019 Apr.

Cited By

Action subsampling supports policy compression in large action spaces.
PLoS Comput Biol. 2025 Sep 5;21(9):e1013444. doi: 10.1371/journal.pcbi.1013444. eCollection 2025 Sep.

Schemas, reinforcement learning and the medial prefrontal cortex.
Nat Rev Neurosci. 2025 Mar;26(3):141-157. doi: 10.1038/s41583-024-00893-z. Epub 2025 Jan 7.

Cognitive Control.
Annu Rev Psychol. 2025 Jan;76(1):167-195. doi: 10.1146/annurev-psych-022024-103901. Epub 2024 Dec 3.

