

Fidelity of the representation of value in decision-making.

Authors

Bays Paul M, Dowding Ben A

Affiliation

University of Cambridge, Department of Psychology, Cambridge, United Kingdom.

Publication

PLoS Comput Biol. 2017 Mar 1;13(3):e1005405. doi: 10.1371/journal.pcbi.1005405. eCollection 2017 Mar.

DOI: 10.1371/journal.pcbi.1005405
PMID: 28248958
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5352141/
Abstract

The ability to make optimal decisions depends on evaluating the expected rewards associated with different potential actions. This process is critically dependent on the fidelity with which reward value information can be maintained in the nervous system. Here we directly probe the fidelity of value representation following a standard reinforcement learning task. The results demonstrate a previously-unrecognized bias in the representation of value: extreme reward values, both low and high, are stored significantly more accurately and precisely than intermediate rewards. The symmetry between low and high rewards pertained despite substantially higher frequency of exposure to high rewards, resulting from preferential exploitation of more rewarding options. The observed variation in fidelity of value representation retrospectively predicted performance on the reinforcement learning task, demonstrating that the bias in representation has an impact on decision-making. A second experiment in which one or other extreme-valued option was omitted from the learning sequence showed that representational fidelity is primarily determined by the relative position of an encoded value on the scale of rewards experienced during learning. Both variability and guessing decreased with the reduction in the number of options, consistent with allocation of a limited representational resource. These findings have implications for existing models of reward-based learning, which typically assume noiseless representation of reward value.
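The abstract's closing contrast — standard reward-learning models assume error-free value storage, while the data support a noisy representation with guessing and a limited resource — can be made concrete with a short sketch. This is an illustrative toy, not the authors' model: the delta-rule update is the standard textbook form, while `noisy_recall` and its parameters (`sd`, `p_guess`) are assumptions chosen for the example, loosely following mixture models of report error used in the working-memory literature.

```python
import random

def delta_rule_update(value, reward, alpha=0.1):
    # Standard reinforcement-learning (Rescorla-Wagner / Q-learning) update:
    # move the stored value toward the received reward by learning rate alpha.
    # Models of this kind typically treat `value` as stored without error.
    return value + alpha * (reward - value)

def noisy_recall(stored_value, sd=0.05, p_guess=0.1, lo=0.0, hi=1.0):
    # Illustrative noisy-representation model (an assumption for this sketch):
    # with probability p_guess the report is a uniform random guess over the
    # reward range; otherwise it is the stored value corrupted by Gaussian
    # noise and clipped back into the range.
    if random.random() < p_guess:
        return random.uniform(lo, hi)
    return min(hi, max(lo, random.gauss(stored_value, sd)))

# Learning the value of an option that pays a constant reward of 0.8:
v = 0.0
for _ in range(200):
    v = delta_rule_update(v, 0.8)
print(round(v, 3))  # converges toward 0.8
```

Under the noiseless assumption a decision-maker would report `v` exactly; the paper's point is that actual reports behave more like `noisy_recall(v)`, with precision that varies systematically with the value's position on the experienced reward scale.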


Figures

Fig 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/8b9118e7690e/pcbi.1005405.g001.jpg
Fig 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/61fa37f4f840/pcbi.1005405.g002.jpg
Fig 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/fe41c86c8356/pcbi.1005405.g003.jpg
Fig 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/76242884ea86/pcbi.1005405.g004.jpg
Fig 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/7023a7340330/pcbi.1005405.g005.jpg
Fig 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/9ba52d3f4422/pcbi.1005405.g006.jpg
Fig 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4cf/5352141/5eb79a848ce6/pcbi.1005405.g007.jpg

Similar Articles

1
Fidelity of the representation of value in decision-making.
PLoS Comput Biol. 2017 Mar 1;13(3):e1005405. doi: 10.1371/journal.pcbi.1005405. eCollection 2017 Mar.
2
How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.
J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.
3
Reactivation of Reward-Related Patterns from Single Past Episodes Supports Memory-Based Decision Making.
J Neurosci. 2016 Mar 9;36(10):2868-80. doi: 10.1523/JNEUROSCI.3433-15.2016.
4
The contribution of striatal pseudo-reward prediction errors to value-based decision-making.
Neuroimage. 2019 Jun;193:67-74. doi: 10.1016/j.neuroimage.2019.02.052. Epub 2019 Mar 7.
5
Cost-benefit trade-offs in decision-making and learning.
PLoS Comput Biol. 2019 Sep 6;15(9):e1007326. doi: 10.1371/journal.pcbi.1007326. eCollection 2019 Sep.
6
Model-based reinforcement learning under concurrent schedules of reinforcement in rodents.
Learn Mem. 2009 Apr 29;16(5):315-23. doi: 10.1101/lm.1295509. Print 2009 May.
7
The Computational Development of Reinforcement Learning during Adolescence.
PLoS Comput Biol. 2016 Jun 20;12(6):e1004953. doi: 10.1371/journal.pcbi.1004953. eCollection 2016 Jun.
8
A reinforcement learning diffusion decision model for value-based decisions.
Psychon Bull Rev. 2019 Aug;26(4):1099-1121. doi: 10.3758/s13423-018-1554-2.
9
Frontal, Striatal, and Medial Temporal Sensitivity to Value Distinguishes Risk-Taking from Risk-Aversive Older Adults during Decision Making.
J Neurosci. 2016 Dec 7;36(49):12498-12509. doi: 10.1523/JNEUROSCI.1386-16.2016.
10
Individual differences and the neural representations of reward expectation and reward prediction error.
Soc Cogn Affect Neurosci. 2007 Mar;2(1):20-30. doi: 10.1093/scan/nsl021.

Cited By

1
Suboptimality in perceptual decision making.
Behav Brain Sci. 2018 Feb 27;41:e223. doi: 10.1017/S0140525X18000936.

References

1
Economic irrationality is optimal during noisy decision making.
Proc Natl Acad Sci U S A. 2016 Mar 15;113(11):3102-7. doi: 10.1073/pnas.1519157113. Epub 2016 Feb 29.
2
Noise in neural populations accounts for errors in working memory.
J Neurosci. 2014 Mar 5;34(10):3632-45. doi: 10.1523/JNEUROSCI.3204-13.2014.
3
Changing concepts of working memory.
Nat Neurosci. 2014 Mar;17(3):347-56. doi: 10.1038/nn.3655. Epub 2014 Feb 25.
4
Remembering the best and worst of times: memories for extreme outcomes bias risky decisions.
Psychon Bull Rev. 2014 Jun;21(3):629-36. doi: 10.3758/s13423-013-0542-9.
5
Prediction, postdiction, and perceptual length contraction: a bayesian low-speed prior captures the cutaneous rabbit and related illusions.
Front Psychol. 2013 May 10;4:221. doi: 10.3389/fpsyg.2013.00221. eCollection 2013.
6
Normalization is a general neural mechanism for context-dependent decision making.
Proc Natl Acad Sci U S A. 2013 Apr 9;110(15):6139-44. doi: 10.1073/pnas.1217854110. Epub 2013 Mar 25.
7
Value normalization in decision making: theory and evidence.
Curr Opin Neurobiol. 2012 Dec;22(6):970-81. doi: 10.1016/j.conb.2012.07.011. Epub 2012 Aug 29.
8
A range-normalization model of context-dependent choice: a new model and evidence.
PLoS Comput Biol. 2012;8(7):e1002607. doi: 10.1371/journal.pcbi.1002607. Epub 2012 Jul 19.
9
Efficient coding and the neural representation of value.
Ann N Y Acad Sci. 2012 Mar;1251:13-32. doi: 10.1111/j.1749-6632.2012.06496.x.
10
Salience driven value integration explains decision biases and preference reversal.
Proc Natl Acad Sci U S A. 2012 Jun 12;109(24):9659-64. doi: 10.1073/pnas.1119569109. Epub 2012 May 25.