Deconstructing the human algorithms for exploration.

Affiliation

Department of Psychology and Center for Brain Science, Harvard University, United States.

Publication information

Cognition. 2018 Apr;173:34-42. doi: 10.1016/j.cognition.2017.12.014. Epub 2017 Dec 29.

Abstract

The dilemma between information gathering (exploration) and reward seeking (exploitation) is a fundamental problem for reinforcement learning agents. How humans resolve this dilemma is still an open question, because experiments have provided equivocal evidence about the underlying algorithms used by humans. We show that two families of algorithms can be distinguished in terms of how uncertainty affects exploration. Algorithms based on uncertainty bonuses predict a change in response bias as a function of uncertainty, whereas algorithms based on sampling predict a change in response slope. Two experiments provide evidence for both bias and slope changes, and computational modeling confirms that a hybrid model is the best quantitative account of the data.
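The bias-versus-slope distinction can be illustrated with a minimal sketch. It assumes a two-armed bandit in which the agent holds a Gaussian belief about each arm (mean `mu`, standard deviation `sd`) and choices follow a probit rule. The function names, the bonus weight `gamma`, the decision-noise parameter `noise_sd`, and the hybrid weights `w_bias` and `w_slope` are all hypothetical and are not taken from the paper. Under these assumptions, an uncertainty bonus adds a term proportional to the difference in uncertainty (an intercept shift), while sampling one value from each belief divides the value difference by the total uncertainty (a slope change).

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_choose_first_bonus(mu1, mu2, sd1, sd2, gamma=1.0, noise_sd=1.0):
    """Uncertainty-bonus rule (assumed form): each arm's value is its mean
    plus gamma times its standard deviation, with fixed decision noise.
    The uncertainty difference shifts the choice curve (bias)."""
    value_diff = mu1 - mu2
    relative_uncertainty = sd1 - sd2
    return normal_cdf((value_diff + gamma * relative_uncertainty) / noise_sd)


def p_choose_first_sampling(mu1, mu2, sd1, sd2):
    """Sampling rule (assumed form): draw one value from each Gaussian
    belief and pick the larger. Total uncertainty flattens the choice
    curve (slope) without shifting it."""
    value_diff = mu1 - mu2
    total_uncertainty = math.sqrt(sd1 ** 2 + sd2 ** 2)
    return normal_cdf(value_diff / total_uncertainty)


def p_choose_first_hybrid(mu1, mu2, sd1, sd2, w_bias=1.0, w_slope=1.0):
    """Hypothetical hybrid: a bias term driven by relative uncertainty
    plus a value term scaled by total uncertainty."""
    value_diff = mu1 - mu2
    relative_uncertainty = sd1 - sd2
    total_uncertainty = math.sqrt(sd1 ** 2 + sd2 ** 2)
    return normal_cdf(w_bias * relative_uncertainty
                      + w_slope * value_diff / total_uncertainty)


if __name__ == "__main__":
    # Equal means, unequal uncertainty: only the bonus rule is biased.
    print(p_choose_first_bonus(0.0, 0.0, 3.0, 1.0))
    print(p_choose_first_sampling(0.0, 0.0, 3.0, 1.0))
    # Same value difference, more total uncertainty: sampling gets noisier.
    print(p_choose_first_sampling(1.0, 0.0, 1.0, 1.0))
    print(p_choose_first_sampling(1.0, 0.0, 3.0, 3.0))
```

In this sketch, raising the uncertainty of both arms equally leaves the bonus rule unchanged but pulls the sampling rule toward chance, which is the kind of signature the abstract describes as distinguishing the two families.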
