Signal detection models as contextual bandits.

Authors

Sherratt Thomas N, O'Neill Erica

Affiliation

Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, Canada K1S 5B6.

Publication information

R Soc Open Sci. 2023 Jun 21;10(6):230157. doi: 10.1098/rsos.230157. eCollection 2023 Jun.

DOI:10.1098/rsos.230157
PMID:37351497
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10282591/
Abstract

Signal detection theory (SDT) has been widely applied to identify the optimal discriminative decisions of receivers under uncertainty. However, the approach assumes that decision-makers immediately adopt the appropriate acceptance threshold, even though the optimal response must often be learned. Here we recast the classical normal-normal (and power-law) signal detection model as a contextual multi-armed bandit (CMAB). Thus, rather than starting with complete information, decision-makers must infer how the magnitude of a continuous cue is related to the probability that a signaller is desirable, while simultaneously seeking to exploit the information they acquire. We explain how various CMAB heuristics resolve the trade-off between better estimating the underlying relationship and exploiting it. Next, we determined how naive human volunteers resolve signal detection problems with a continuous cue. As anticipated, a model of choice (accept/reject) that assumed volunteers immediately adopted the SDT-predicted acceptance threshold did not predict volunteer behaviour well. The Softmax rule for solving CMABs, with choices based on a logistic function of the expected payoffs, best explained the decisions of our volunteers but a simple midpoint algorithm also predicted decisions well under some conditions. CMABs offer principled parametric solutions to solving many classical SDT problems when decision-makers start with incomplete information.
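
As a rough illustration of the two ideas the abstract contrasts, the sketch below computes a fixed acceptance threshold for an equal-variance normal-normal model and a Softmax-style accept probability given as a logistic function of the difference in expected payoffs. It is not the authors' implementation. The function names, the payoff structure (a benefit for accepting a desirable signaller, a cost for accepting an undesirable one, zero for rejecting), the equal-variance assumption, and the `temperature` parameter are all assumptions made here for illustration. In a contextual bandit the probability that a signaller is desirable would be estimated from experience; this sketch substitutes the true posterior for simplicity.

```python
import math


def sdt_threshold(mu_d, mu_u, sigma, p_desirable, benefit, cost):
    """Cue value above which accepting has the higher expected payoff.

    Assumes cue ~ N(mu_d, sigma) for desirable signallers and
    N(mu_u, sigma) for undesirable ones, with mu_d > mu_u (hypothetical setup).
    """
    # Accept when the likelihood ratio exceeds (1 - p) * cost / (p * benefit).
    log_beta = math.log(((1 - p_desirable) * cost) / (p_desirable * benefit))
    return (mu_d + mu_u) / 2 + sigma**2 / (mu_d - mu_u) * log_beta


def posterior_desirable(x, mu_d, mu_u, sigma, p_desirable):
    """Probability that a signaller with cue x is desirable (Bayes' rule)."""
    # The shared normalising constant of the two normal densities cancels.
    like_d = math.exp(-((x - mu_d) ** 2) / (2 * sigma**2))
    like_u = math.exp(-((x - mu_u) ** 2) / (2 * sigma**2))
    return p_desirable * like_d / (p_desirable * like_d + (1 - p_desirable) * like_u)


def softmax_accept_probability(q, benefit, cost, temperature):
    """Logistic choice rule over expected payoffs of accepting vs rejecting."""
    expected_accept = q * benefit - (1 - q) * cost
    expected_reject = 0.0  # assumption: rejecting yields no payoff
    return 1 / (1 + math.exp(-(expected_accept - expected_reject) / temperature))


if __name__ == "__main__":
    threshold = sdt_threshold(mu_d=1.0, mu_u=0.0, sigma=1.0,
                              p_desirable=0.5, benefit=1.0, cost=2.0)
    for cue in (threshold - 1.0, threshold, threshold + 1.0):
        q = posterior_desirable(cue, 1.0, 0.0, 1.0, 0.5)
        p_accept = softmax_accept_probability(q, 1.0, 2.0, temperature=0.2)
        print(f"cue={cue:.3f}  P(desirable)={q:.3f}  P(accept)={p_accept:.3f}")
```

Under these assumptions the fixed-threshold rule accepts every cue above the threshold and rejects every cue below it. The logistic rule instead accepts with probability 0.5 exactly at the threshold and becomes more deterministic as `temperature` shrinks.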

