Humans adaptively resolve the explore-exploit dilemma under cognitive constraints: Evidence from a multi-armed bandit task.

Author information

Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA.

Department of Psychology, Pennsylvania State University, State College, PA, USA; Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

Publication information

Cognition. 2022 Dec;229:105233. doi: 10.1016/j.cognition.2022.105233. Epub 2022 Jul 30.

DOI: 10.1016/j.cognition.2022.105233

PMID: 35917612

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9530017/

Abstract

When navigating uncertain worlds, humans must balance exploring new options versus exploiting known rewards. Longer horizons and spatially structured option values encourage humans to explore, but the impact of real-world cognitive constraints such as environment size and memory demands on explore-exploit decisions is unclear. In the present study, humans chose between options varying in uncertainty during a multi-armed bandit task with varying environment size and memory demands. Regression and cognitive computational models of choice behavior showed that with a lower cognitive load, humans are more exploratory than a simulated value-maximizing learner, but under cognitive constraints, they adaptively scale down exploration to maintain exploitation. Thus, while humans are curious, cognitive constraints force people to decrease their strategic exploration in a resource-rational-like manner to focus on harvesting known rewards.
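The trade-off described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' task or model: the function name `run_bandit`, the `bonus` parameter, the Gaussian reward noise, and the `1/sqrt(n + 1)` uncertainty term are all assumptions chosen for illustration. A learner with `bonus = 0` always picks the arm with the highest estimated value, while a larger `bonus` pushes choices toward arms that have been sampled less often.

```python
import math
import random


def run_bandit(arm_means, n_trials, bonus, seed=0):
    """Simulate one learner on a Gaussian multi-armed bandit (illustrative only).

    On each trial the learner picks the arm with the highest score, where
    score = estimated mean + bonus / sqrt(times sampled + 1).
    Returns (choice counts per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms          # how often each arm was chosen
    estimates = [0.0] * n_arms     # running mean reward per arm
    total_reward = 0.0

    for _ in range(n_trials):
        # Hypothetical uncertainty bonus: larger for rarely sampled arms.
        scores = [
            estimates[a] + bonus / math.sqrt(counts[a] + 1)
            for a in range(n_arms)
        ]
        best = max(scores)
        # Break ties between equally scored arms at random.
        choice = rng.choice([a for a in range(n_arms) if scores[a] == best])

        reward = rng.gauss(arm_means[choice], 1.0)
        counts[choice] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
        total_reward += reward

    return counts, total_reward


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8, 0.3]  # assumed true arm values
    for b in (0.0, 1.0, 5.0):
        counts, total = run_bandit(means, n_trials=100, bonus=b, seed=1)
        print(f"bonus={b}: counts={counts}, total reward={total:.1f}")
```

In this sketch, lowering `bonus` is only a loose stand-in for the scaled-down exploration the abstract reports under higher cognitive load. The study itself fit regression and cognitive computational models to human choices, which this example does not reproduce.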
