Differential effects of reward and punishment in decision making under uncertainty: a computational study.

Affiliations

School of Computing, University of Leeds, Leeds, West Yorkshire, UK.

Neuroscience and Psychiatry Unit, University of Manchester, Manchester, UK.

Publication information

Front Neurosci. 2014 Feb 21;8:30. doi: 10.3389/fnins.2014.00030. eCollection 2014.

DOI:10.3389/fnins.2014.00030
PMID:24600342
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3930867/
Abstract

Computational models of learning have proved largely successful in characterizing potential mechanisms which allow humans to make decisions in uncertain and volatile contexts. We report here findings that extend existing knowledge and show that a modified reinforcement learning model, which has separate parameters according to whether the previous trial gave a reward or a punishment, can provide the best fit to human behavior in decision making under uncertainty. More specifically, we examined the fit of our modified reinforcement learning model to human behavioral data in a probabilistic two-alternative decision making task with rule reversals. Our results demonstrate that this model predicted human behavior better than a series of other models based on reinforcement learning or Bayesian reasoning. Unlike the Bayesian models, our modified reinforcement learning model does not include any representation of rule switches. When our task is considered purely as a machine learning task, to gain as many rewards as possible without trying to describe human behavior, the performance of modified reinforcement learning and Bayesian methods is similar. Others have used various computational models to describe human behavior in similar tasks; however, we are not aware of any who have compared Bayesian reasoning with reinforcement learning modified to differentiate rewards and punishments.
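To make the idea of a reinforcement learner with separate reward and punishment parameters more concrete, here is a minimal Python sketch. It is not the authors' model. The abstract does not say which parameters are split or how choices are generated, so the sketch assumes two learning rates (`alpha_reward`, `alpha_punish`), a softmax choice rule with inverse temperature `beta`, outcomes coded as +1 and -1, and a task in which the better option pays off with probability `p_correct` and the rule reverses every `reversal_every` trials. All of these names and default values are hypothetical.

```python
import math
import random


def update_value(q, outcome, alpha_reward, alpha_punish):
    """Delta-rule update whose learning rate depends on the outcome sign.

    Assumption: outcome is +1 for a reward and -1 for a punishment.
    """
    alpha = alpha_reward if outcome > 0 else alpha_punish
    return q + alpha * (outcome - q)


def simulate(alpha_reward, alpha_punish, beta, n_trials=200,
             p_correct=0.8, reversal_every=40, seed=0):
    """Run the agent on a probabilistic two-choice task with rule reversals.

    Returns the fraction of trials on which the agent was rewarded.
    All defaults are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]   # value estimate for each of the two options
    best = 0         # option currently favoured by the hidden rule
    rewarded = 0
    for t in range(n_trials):
        if t > 0 and t % reversal_every == 0:
            best = 1 - best  # rule reversal, never signalled to the agent
        # Softmax over two options reduces to a logistic of the value gap.
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        choice = 1 if rng.random() < p_choose_1 else 0
        # Probabilistic feedback: the better option still loses sometimes.
        p_win = p_correct if choice == best else 1.0 - p_correct
        outcome = 1.0 if rng.random() < p_win else -1.0
        q[choice] = update_value(q[choice], outcome, alpha_reward, alpha_punish)
        if outcome > 0:
            rewarded += 1
    return rewarded / n_trials


if __name__ == "__main__":
    # Compare a symmetric learner with one that weights punishments more.
    print(simulate(alpha_reward=0.3, alpha_punish=0.3, beta=5.0))
    print(simulate(alpha_reward=0.3, alpha_punish=0.6, beta=5.0))
```

Note that the agent keeps no explicit representation of rule switches: it tracks only the two value estimates, which is the property the abstract contrasts with the Bayesian models. Fitting such a model to human choices would additionally require a likelihood over observed choices, which this sketch omits.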

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/616f46202b49/fnins-08-00030-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/7375fd62d440/fnins-08-00030-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/77d8932e41e7/fnins-08-00030-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/e915fbd09e88/fnins-08-00030-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/34f3afb1e1a6/fnins-08-00030-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/eaf40d74a86a/fnins-08-00030-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/fbcfe3bc58d4/fnins-08-00030-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/7311ff8d90a4/fnins-08-00030-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c768/3930867/6a611752ef81/fnins-08-00030-g0009.jpg
