Cognitive Science Program and Department of Psychological and Brain Sciences, Indiana University Bloomington.
Indeed, Inc.
Cogn Sci. 2020 Feb;44(2):e12817. doi: 10.1111/cogs.12817.
How, and how well, do people switch between exploration and exploitation to search for and accumulate resources? We study the decision processes underlying such exploration/exploitation trade-offs using a novel card selection task that captures the common situation of searching among multiple resources (e.g., jobs) that can be exploited without depleting. With experience, participants learn to switch appropriately between exploration and exploitation and approach optimal performance. We model participants' behavior on this task with random, threshold, and sampling strategies, and find that a linearly decreasing threshold rule best fits participants' results. Further evidence that participants use decreasing threshold-based strategies comes from reaction time differences between exploration and exploitation; however, participants themselves report non-decreasing thresholds. Decreasing threshold strategies that "front-load" exploration and switch quickly to exploitation are particularly effective in resource accumulation tasks, in contrast to optimal stopping problems like the Secretary Problem, which require longer exploration.
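To make the idea of a linearly decreasing threshold rule concrete, the following is a minimal sketch under stated assumptions, not the authors' model or task. The abstract does not specify the task parameters, so everything here is hypothetical: the number of turns (20), the card value range (1 to 99), the threshold endpoints (90 and 40), the rule that a newly revealed card pays its value on the turn it is drawn, and the function names `linear_threshold` and `play_round`.

```python
import random


def linear_threshold(turn, n_turns, start, end):
    """Threshold that falls linearly from `start` (turn 0) to `end` (last turn)."""
    if n_turns == 1:
        return start
    return start + (end - start) * turn / (n_turns - 1)


def play_round(n_turns=20, start=90.0, end=40.0, low=1, high=99, rng=None):
    """Play one round of a hypothetical card task with a decreasing threshold.

    Assumed rules (not taken from the source): on each turn the player
    either reveals a new card (explore) or re-selects the best card seen
    so far (exploit). Either way, the chosen card's value is added to the
    total, and re-selecting a card does not deplete it.
    """
    rng = rng or random.Random()
    best = None      # highest card value revealed so far
    total = 0        # accumulated points
    choices = []     # "explore" / "exploit" per turn
    for turn in range(n_turns):
        threshold = linear_threshold(turn, n_turns, start, end)
        if best is not None and best >= threshold:
            # Best known card meets the current threshold: exploit it.
            total += best
            choices.append("exploit")
        else:
            # Otherwise keep exploring by revealing a new card.
            card = rng.randint(low, high)
            total += card
            best = card if best is None else max(best, card)
            choices.append("explore")
    return total, choices


if __name__ == "__main__":
    total, choices = play_round(rng=random.Random(0))
    print(total, choices)
```

Because the threshold never rises and the best known card never changes once exploitation begins, this sketch always explores first and then exploits for the rest of the round, which is the "front-loaded" exploration pattern the abstract describes.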