Signal detection models as contextual bandits.

Authors

Sherratt Thomas N, O'Neill Erica

Affiliation

Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, Canada K1S 5B6.

Publication information

R Soc Open Sci. 2023 Jun 21;10(6):230157. doi: 10.1098/rsos.230157. eCollection 2023 Jun.

DOI:10.1098/rsos.230157

PMID:37351497

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10282591/

Abstract

Signal detection theory (SDT) has been widely applied to identify the optimal discriminative decisions of receivers under uncertainty. However, the approach assumes that decision-makers immediately adopt the appropriate acceptance threshold, even though the optimal response must often be learned. Here we recast the classical normal-normal (and power-law) signal detection model as a contextual multi-armed bandit (CMAB). Thus, rather than starting with complete information, decision-makers must infer how the magnitude of a continuous cue is related to the probability that a signaller is desirable, while simultaneously seeking to exploit the information they acquire. We explain how various CMAB heuristics resolve the trade-off between better estimating the underlying relationship and exploiting it. Next, we determined how naive human volunteers resolve signal detection problems with a continuous cue. As anticipated, a model of choice (accept/reject) that assumed volunteers immediately adopted the SDT-predicted acceptance threshold did not predict volunteer behaviour well. The Softmax rule for solving CMABs, with choices based on a logistic function of the expected payoffs, best explained the decisions of our volunteers but a simple midpoint algorithm also predicted decisions well under some conditions. CMABs offer principled parametric solutions to solving many classical SDT problems when decision-makers start with incomplete information.
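
To make the two decision rules named in the abstract concrete, here is a minimal Python sketch of a fixed SDT acceptance threshold and a Softmax (logistic) choice rule. It is an illustration under stated assumptions, not the authors' implementation. It assumes an equal-variance normal-normal model, a payoff of `benefit` for accepting a desirable signaller, a loss of `cost` for accepting an undesirable one, and zero payoff for rejecting. The names `mu_d`, `mu_u`, `sigma`, `p`, `benefit`, `cost` and `tau` are all hypothetical. For simplicity the posterior is computed from known parameters, whereas a CMAB learner would have to estimate it from experience.

```python
import math


def posterior_desirable(x, mu_d, mu_u, sigma, p):
    """P(desirable | cue x), assuming equal-variance normal cue distributions.

    mu_d, mu_u: assumed cue means for desirable / undesirable signallers.
    p: assumed prior probability that a signaller is desirable.
    """
    log_likelihood_ratio = (mu_d - mu_u) / sigma**2 * (x - (mu_d + mu_u) / 2)
    log_prior_odds = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-(log_likelihood_ratio + log_prior_odds)))


def sdt_threshold(mu_d, mu_u, sigma, p, benefit, cost):
    """Cue value at which the expected payoff of accepting is zero.

    Assumes mu_d > mu_u, so cues above the threshold favour acceptance.
    """
    return (mu_d + mu_u) / 2 + sigma**2 / (mu_d - mu_u) * math.log(
        (1 - p) * cost / (p * benefit)
    )


def softmax_accept_prob(x, mu_d, mu_u, sigma, p, benefit, cost, tau):
    """Probability of accepting under a two-option Softmax rule.

    With rejection paying 0, Softmax over {accept, reject} reduces to a
    logistic function of the expected payoff of accepting; tau is an
    assumed temperature controlling how random the choice is.
    """
    q = posterior_desirable(x, mu_d, mu_u, sigma, p)
    expected_accept = q * benefit - (1 - q) * cost
    return 1 / (1 + math.exp(-expected_accept / tau))


if __name__ == "__main__":
    params = dict(mu_d=2.0, mu_u=0.0, sigma=1.0, p=0.3, benefit=2.0, cost=1.0)
    x_star = sdt_threshold(**params)
    for x in (x_star - 1.0, x_star, x_star + 1.0):
        prob = softmax_accept_prob(x, tau=0.5, **params)
        print(f"cue={x:.3f}  P(accept)={prob:.3f}")
```

Under these assumptions the hard-threshold rule accepts exactly when the cue exceeds `sdt_threshold(...)`, whereas the Softmax rule accepts with probability 0.5 at that cue value and grades smoothly on either side. Lowering `tau` moves the Softmax rule towards the hard threshold.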
