Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain.

Affiliation

Psychology Department and Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey 08540, USA.

Publication information

J Neurosci. 2012 Jan 11;32(2):551-62. doi: 10.1523/JNEUROSCI.5498-10.2012.

DOI: 10.1523/JNEUROSCI.5498-10.2012
PMID: 22238090
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6621075/

Abstract

Humans and animals are exquisitely, though idiosyncratically, sensitive to risk or variance in the outcomes of their actions. Economic, psychological, and neural aspects of this are well studied when information about risk is provided explicitly. However, we must normally learn about outcomes from experience, through trial and error. Traditional models of such reinforcement learning focus on learning about the mean reward value of cues and ignore higher order moments such as variance. We used fMRI to test whether the neural correlates of human reinforcement learning are sensitive to experienced risk. Our analysis focused on anatomically delineated regions of a priori interest in the nucleus accumbens, where blood oxygenation level-dependent (BOLD) signals have been suggested as correlating with quantities derived from reinforcement learning. We first provide unbiased evidence that the raw BOLD signal in these regions corresponds closely to a reward prediction error. We then derive from this signal the learned values of cues that predict rewards of equal mean but different variance and show that these values are indeed modulated by experienced risk. Moreover, a close neurometric-psychometric coupling exists between the fluctuations of the experience-based evaluations of risky options that we measured neurally and the fluctuations in behavioral risk aversion. This suggests that risk sensitivity is integral to human learning, illuminating economic models of choice, neuroscientific models of affective learning, and the workings of the underlying neural mechanisms.
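
The contrast the abstract draws, between learning only the mean reward of a cue and learning a value that is modulated by experienced variance, can be illustrated with a small simulation. The sketch below is not taken from the abstract: it assumes one common way of making prediction-error learning risk-sensitive, namely using different learning rates for positive and negative prediction errors. The function names, learning rates, and reward amounts are hypothetical choices made for illustration only.

```python
import random


def learn_values(rewards, alpha_pos, alpha_neg, v0=0.0):
    """Track a cue's learned value across trials with a prediction-error rule.

    On each trial the prediction error is delta = reward - value. The value
    moves toward the reward by alpha_pos * delta when delta is positive and
    by alpha_neg * delta otherwise. Equal rates give a standard mean-tracking
    learner; unequal rates are an ASSUMED mechanism for risk sensitivity.
    """
    value = v0
    history = []
    for reward in rewards:
        delta = reward - value                      # reward prediction error
        alpha = alpha_pos if delta > 0 else alpha_neg
        value += alpha * delta
        history.append(value)
    return history


def late_mean(history):
    """Average the learned value over the second half of the trials."""
    late = history[len(history) // 2:]
    return sum(late) / len(late)


# Hypothetical task: two cues with equal mean reward but different variance.
rng = random.Random(0)
n_trials = 4000
sure_rewards = [20.0] * n_trials                                    # always 20
risky_rewards = [rng.choice([0.0, 40.0]) for _ in range(n_trials)]  # 0 or 40

# Symmetric learner: tracks the mean, so variance leaves the value unchanged.
neutral_risky = late_mean(learn_values(risky_rewards, 0.1, 0.1))

# Asymmetric learner: negative errors weigh more, so the risky cue is
# valued below its mean while the sure cue is unaffected.
averse_risky = late_mean(learn_values(risky_rewards, 0.1, 0.2))
averse_sure = late_mean(learn_values(sure_rewards, 0.1, 0.2))

print(f"risky cue, symmetric rates:  {neutral_risky:.2f}")
print(f"risky cue, asymmetric rates: {averse_risky:.2f}")
print(f"sure cue, asymmetric rates:  {averse_sure:.2f}")
```

Under these assumptions the two cues have the same average payoff, yet the asymmetric learner ends up valuing the risky cue less than the sure one. That is the kind of variance-dependent learned value the abstract describes, although the study's actual model and parameters may differ.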
