Reward and fictive prediction error signals in ventral striatum: asymmetry between factual and counterfactual processing.

Affiliations

FIDMAG Germanes Hospitalàries Research Foundation, Carrer Antoni Pujades 38, 08830 Sant Boi de Llobregat, Barcelona, Spain.

Universitat de Barcelona, Barcelona, Spain.

Publication information

Brain Struct Funct. 2021 Jun;226(5):1553-1569. doi: 10.1007/s00429-021-02270-3. Epub 2021 Apr 11.

DOI:10.1007/s00429-021-02270-3
PMID:33839955
Abstract

Reward prediction error, the difference between the expected and obtained reward, is known to act as a reinforcement learning neural signal. In the current study, we propose a model fitting approach that combines behavioral and neural data to fit computational models of reinforcement learning. Briefly, we penalized subject-specific fitted parameters that moved too far away from the group median, except when that deviation led to an improvement in the model's fit to neural responses. By means of a probabilistic monetary learning task and fMRI, we compared our approach with standard model fitting methods. Q-learning outperformed actor-critic at both the behavioral and neural levels, although the inclusion of neuroimaging data in model fitting improved the fit of actor-critic models. We observed both action-value and state-value prediction error signals in the striatum, while standard model fitting approaches failed to capture state-value signals. Finally, left ventral striatum correlated with reward prediction error while right ventral striatum correlated with fictive prediction error, suggesting a functional hemispheric asymmetry regarding prediction-error driven learning.
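
The two ideas in the abstract, prediction-error learning and a fit that penalizes drift from the group median, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the equations, so several things here are assumptions: the two-option task, the softmax choice rule, the form of the fictive prediction error (foregone outcome minus the value of the unchosen option), the squared-distance penalty, and the rule for waiving it. All function and parameter names (`q_learning_nll`, `penalized_objective`, `lam`, `neural_fit`, and so on) are hypothetical.

```python
import math


def softmax_prob(q, chosen, beta):
    """Probability of picking `chosen` under a softmax over action values."""
    top = max(q)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - top)) for v in q]
    return weights[chosen] / sum(weights)


def q_learning_nll(choices, rewards, foregone, alpha, beta):
    """Run a two-option Q-learner over one subject's trials.

    Returns (negative log-likelihood, reward PEs, fictive PEs).
    `foregone` holds the outcome of the option NOT chosen; assuming it is
    shown to the subject is part of this sketch, not a claim about the task.
    """
    q = [0.0, 0.0]
    nll, reward_pe, fictive_pe = 0.0, [], []
    for c, r, f in zip(choices, rewards, foregone):
        nll -= math.log(softmax_prob(q, c, beta))
        other = 1 - c
        delta = r - q[c]        # reward PE: obtained minus expected reward
        delta_f = f - q[other]  # fictive PE (assumed form, see lead-in)
        q[c] += alpha * delta
        q[other] += alpha * delta_f
        reward_pe.append(delta)
        fictive_pe.append(delta_f)
    return nll, reward_pe, fictive_pe


def penalized_objective(nll, params, group_median, lam,
                        neural_fit, neural_fit_at_median):
    """Behavioral misfit plus a penalty for straying from the group median.

    Hypothetical rule: the penalty is waived when the subject's own
    parameters fit the neural data better than the group-median ones do.
    """
    if neural_fit > neural_fit_at_median:
        return nll
    deviation = sum((p - m) ** 2 for p, m in zip(params, group_median))
    return nll + lam * deviation
```

In a fitting loop, `penalized_objective` would be minimized over `alpha` and `beta` for each subject, for example with a general-purpose optimizer, and the returned prediction-error series would serve as parametric regressors for the fMRI analysis.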

Similar articles

1
Reward and fictive prediction error signals in ventral striatum: asymmetry between factual and counterfactual processing.
Brain Struct Funct. 2021 Jun;226(5):1553-1569. doi: 10.1007/s00429-021-02270-3. Epub 2021 Apr 11.
2
Distinct prediction errors in mesostriatal circuits of the human brain mediate learning about the values of both states and actions: evidence from high-resolution fMRI.
PLoS Comput Biol. 2017 Oct 19;13(10):e1005810. doi: 10.1371/journal.pcbi.1005810. eCollection 2017 Oct.
3
Signed Reward Prediction Errors in the Ventral Striatum Drive Episodic Memory.
J Neurosci. 2021 Feb 24;41(8):1716-1726. doi: 10.1523/JNEUROSCI.1785-20.2020. Epub 2020 Dec 17.
4
Disrupted reinforcement learning during post-error slowing in ADHD.
PLoS One. 2019 Feb 20;14(2):e0206780. doi: 10.1371/journal.pone.0206780. eCollection 2019.
5
Reinforcement Learning Disruptions in Individuals With Depression and Sensitivity to Symptom Change Following Cognitive Behavioral Therapy.
JAMA Psychiatry. 2021 Oct 1;78(10):1113-1122. doi: 10.1001/jamapsychiatry.2021.1844.
6
Neural mechanisms of reinforcement learning in unmedicated patients with major depressive disorder.
Brain. 2017 Apr 1;140(4):1147-1157. doi: 10.1093/brain/awx025.
7
Impaired flexible reward learning in ADHD patients is associated with blunted reinforcement sensitivity and neural signals in ventral striatum and parietal cortex.
Neuroimage Clin. 2024;42:103588. doi: 10.1016/j.nicl.2024.103588. Epub 2024 Mar 1.
8
Interaction of Instrumental and Goal-Directed Learning Modulates Prediction Error Representations in the Ventral Striatum.
J Neurosci. 2016 Dec 14;36(50):12650-12660. doi: 10.1523/JNEUROSCI.1677-16.2016.
9
Differential reinforcement learning responses to positive and negative information in unmedicated individuals with depression.
Eur Neuropsychopharmacol. 2021 Dec;53:89-100. doi: 10.1016/j.euroneuro.2021.08.002. Epub 2021 Sep 10.
10
Congruence of Inherent and Acquired Values Facilitates Reward-Based Decision-Making.
J Neurosci. 2016 May 4;36(18):5003-12. doi: 10.1523/JNEUROSCI.3084-15.2016.
