Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations.

Authors

Zhou Corey Yishan, Guo Dalin, Yu Angela J

Affiliation

Department of Cognitive Science, University of California, San Diego, La Jolla, CA 92093, USA.

Publication

Cogsci. 2020 Jul-Aug;42:1682-1688.

PMID:34355220
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8336429/
Abstract

Humans frequently overestimate the likelihood of desirable events while underestimating the likelihood of undesirable ones: a phenomenon known as unrealistic optimism. Previously, it was suggested that unrealistic optimism arises from asymmetric belief updating, with a relatively reduced coding of undesirable information. Prior studies have shown that a reinforcement learning (RL) model with asymmetric learning rates (greater for a positive prediction error than for a negative prediction error) could account for unrealistic optimism in a bandit task, in particular the tendency of human subjects to persistently choose a single option when there are multiple equally good options. Here, we propose an alternative explanation of such persistent behavior, by modeling human behavior using a Bayesian hidden Markov model, the Dynamic Belief Model (DBM). We find that DBM captures human choice behavior better than the previously proposed asymmetric RL model. Whereas asymmetric RL attains a measure of optimism by giving better-than-expected outcomes higher learning weights than worse-than-expected outcomes, DBM does so by progressively devaluing the unchosen options, thus placing a greater emphasis on choice history independent of reward outcome (e.g., an oft-chosen option might continue to be preferred even if it has not been particularly rewarding), which has broadly been shown to underlie sequential effects in a variety of behavioral settings. Moreover, previous work showed that the devaluation of unchosen options in DBM helps to compensate for a default assumption of environmental non-stationarity, thus allowing the decision-maker both to be more adaptive in changing environments and to still obtain near-optimal performance in stationary environments. Thus, the current work suggests both a novel rationale and a novel mechanism for persistent behavior in bandit tasks.
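The two mechanisms contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learning rates, the grid resolution, the Beta(1, 3)-shaped prior, the persistence probability `alpha`, and all function names are assumptions chosen for illustration. The DBM step follows the standard formulation in which an arm's reward rate persists with probability `alpha` and is otherwise redrawn from a fixed prior. In this sketch, an unchosen arm loses value only because the assumed prior mean lies below the learned estimate.

```python
# Illustrative sketch only; parameter values and names are assumptions.

def asymmetric_rl_update(q, reward, lr_pos=0.3, lr_neg=0.1):
    """Update a value estimate, learning faster from positive prediction errors."""
    delta = reward - q                      # prediction error
    lr = lr_pos if delta > 0 else lr_neg    # asymmetric learning rate
    return q + lr * delta


# Candidate reward rates on a grid strictly inside (0, 1).
GRID = [(i + 0.5) / 50 for i in range(50)]


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def make_prior(a=1.0, b=3.0):
    """Discretized Beta(a, b)-shaped prior; a < b puts the prior mean below 0.5."""
    return normalize([th ** (a - 1) * (1 - th) ** (b - 1) for th in GRID])


def dbm_step(belief, prior, alpha=0.8, reward=None):
    """Advance one arm's belief by one trial under a DBM-style update.

    The reward rate is assumed to persist with probability alpha and to be
    redrawn from the prior otherwise. reward=None means the arm was not
    chosen, so the belief is only mixed with the prior (no observation).
    """
    predictive = [alpha * b + (1 - alpha) * p for b, p in zip(belief, prior)]
    if reward is None:
        return predictive
    likelihood = [th if reward == 1 else 1 - th for th in GRID]
    return normalize([l * p for l, p in zip(likelihood, predictive)])


def mean(belief):
    """Expected reward rate under a discretized belief."""
    return sum(th * b for th, b in zip(GRID, belief))


if __name__ == "__main__":
    prior = make_prior()
    belief = dbm_step(prior, prior, reward=1)   # arm sampled once, rewarded
    for trial in range(5):                      # then left unchosen
        belief = dbm_step(belief, prior)
        print(trial, round(mean(belief), 3))
```

Running the script prints the expected reward rate of an arm that was rewarded once and then left unchosen for five trials; the estimate drifts back toward the prior mean without any new outcome being observed. The asymmetric RL update, by contrast, changes a value only when an outcome is observed for that option.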


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/086f/8336429/730e54f76516/nihms-1725395-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/086f/8336429/002d713502e3/nihms-1725395-f0001.jpg

Similar articles

1. Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations.
   Cogsci. 2020 Jul-Aug;42:1682-1688.
2. Overtaking method based on sand-sifter mechanism: Why do optimistic value functions find optimal solutions in multi-armed bandit problems?
   Biosystems. 2015 Sep;135:55-65. doi: 10.1016/j.biosystems.2015.06.009. Epub 2015 Jul 10.
3. Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World.
   Cogsci. 2021 Jul;43:2045-2051.
4. A pessimistic view of optimistic belief updating.
   Cogn Psychol. 2016 Nov;90:71-127. doi: 10.1016/j.cogpsych.2016.05.004. Epub 2016 Aug 16.
5. Altered Statistical Learning and Decision-Making in Methamphetamine Dependence: Evidence from a Two-Armed Bandit Task.
   Front Psychol. 2015 Dec 18;6:1910. doi: 10.3389/fpsyg.2015.01910. eCollection 2015.
6. Computational mechanisms underlying latent value updating of unchosen actions.
   Sci Adv. 2023 Oct 20;9(42):eadi2704. doi: 10.1126/sciadv.adi2704.
7. Primate Orbitofrontal Cortex Codes Information Relevant for Managing Explore-Exploit Tradeoffs.
   J Neurosci. 2020 Mar 18;40(12):2553-2561. doi: 10.1523/JNEUROSCI.2355-19.2020. Epub 2020 Feb 14.
8. Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation.
   Adv Neural Inf Process Syst. 2018 Dec;31:2781-2790.
9. A Normative Account of Confirmation Bias During Reinforcement Learning.
   Neural Comput. 2022 Jan 14;34(2):307-337. doi: 10.1162/neco_a_01455.
10. Anhedonia and anxiety underlying depressive symptomatology have distinct effects on reward-based decision-making.
    PLoS One. 2017 Oct 23;12(10):e0186473. doi: 10.1371/journal.pone.0186473. eCollection 2017.

Cited by

1. Assessing social anhedonia in a transdiagnostic sample: Insights from a computational psychiatry lens.
   J Mood Anxiety Disord. 2024 Sep 17;8:100088. doi: 10.1016/j.xjmad.2024.100088. eCollection 2024 Dec.
2. Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World.
   Cogsci. 2021 Jul;43:2045-2051.
