

The contribution of striatal pseudo-reward prediction errors to value-based decision-making.

Affiliations

Montreal Neurological Institute, McGill University, Montreal, QC, H3A 2B4, Canada.

Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands; Lyon Neuroscience Research Center - INSERM U1028 - CNRS UMR5292, PSYR2 Team, University of Lyon, Lyon, France.

Publication information

Neuroimage. 2019 Jun;193:67-74. doi: 10.1016/j.neuroimage.2019.02.052. Epub 2019 Mar 7.

DOI: 10.1016/j.neuroimage.2019.02.052
PMID: 30851446
Abstract

Most studies that have investigated the brain mechanisms underlying learning have focused on the ability to learn simple stimulus-response associations. However, in everyday life, outcomes are often obtained through complex behavioral patterns involving a series of actions. Parallel learning systems might be important to reduce the complexity of the learning problem in such scenarios, as proposed in the framework of hierarchical reinforcement learning (HRL). The key feature of HRL is the decomposition of complex sets of action into subgoals. These subgoals are associated with the computation of pseudo-reward prediction errors (PRPEs), which allow the reinforcement of actions that led to a subgoal before the final goal itself is achieved. Here we wanted to test the hypothesis that, despite not carrying any rewarding value per se, pseudo-rewards might generate a bias in choice behavior in the absence of any advantage. Second, we also hypothesized that this bias might be related to the strength of PRPE striatal representations. In order to test these ideas, we developed a novel decision-making paradigm to assess reward prediction errors (RPEs) and PRPEs in two studies (fMRI study: n = 20; behavioral study: n = 19). Our results show that the participants developed a preference for the most pseudo-rewarding option throughout the task, even though it did not lead to more monetary rewards. fMRI analyses revealed that this preference was predicted by individual differences in the relative striatal sensitivity to PRPEs vs RPEs. Together, our results indicate that pseudo-rewards generate learning signals in the striatum and subsequently bias choice behavior despite their lack of association with actual reward.
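The abstract describes two parallel learning signals: a reward prediction error (RPE) driven by the actual monetary outcome, and a pseudo-reward prediction error (PRPE) driven by reaching a subgoal, with choice biased by their relative weighting. A minimal sketch of that idea in a Q-learning style update is below; all variable names, probabilities, and the weighting parameter `w` are illustrative assumptions, not the paper's actual computational model.

```python
import numpy as np

# Illustrative sketch: separate value estimates updated by RPEs (real reward)
# and PRPEs (subgoal pseudo-reward), with choice biased by both.
# All parameters here are hypothetical, not fitted values from the study.

rng = np.random.default_rng(0)
alpha = 0.1       # learning rate
n_options = 2

q_reward = np.zeros(n_options)   # values learned from real reward
q_pseudo = np.zeros(n_options)   # values learned from pseudo-reward

def softmax(q, beta=3.0):
    e = np.exp(beta * (q - q.max()))
    return e / e.sum()

for trial in range(500):
    # w plays the role of relative sensitivity to PRPEs vs RPEs
    w = 0.5
    p = softmax((1 - w) * q_reward + w * q_pseudo)
    choice = rng.choice(n_options, p=p)

    # option 1 reaches the subgoal more often (pseudo-reward),
    # but both options pay the same real reward on average
    pseudo_reward = float(rng.random() < (0.8 if choice == 1 else 0.2))
    reward = float(rng.random() < 0.5)

    rpe = reward - q_reward[choice]          # reward prediction error
    prpe = pseudo_reward - q_pseudo[choice]  # pseudo-reward prediction error
    q_reward[choice] += alpha * rpe
    q_pseudo[choice] += alpha * prpe
```

With these settings, `q_pseudo` comes to favor the more pseudo-rewarding option even though mean real reward is matched, which is the behavioral bias the abstract reports.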


Similar articles

1
The contribution of striatal pseudo-reward prediction errors to value-based decision-making.
Neuroimage. 2019 Jun;193:67-74. doi: 10.1016/j.neuroimage.2019.02.052. Epub 2019 Mar 7.
2
Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making.
J Neurosci. 2007 Nov 21;27(47):12860-7. doi: 10.1523/JNEUROSCI.2496-07.2007.
3
Credit Assignment in a Motor Decision Making Task Is Influenced by Agency and Not Sensory Prediction Errors.
J Neurosci. 2018 May 9;38(19):4521-4530. doi: 10.1523/JNEUROSCI.3601-17.2018. Epub 2018 Apr 12.
4
The Attraction Effect Modulates Reward Prediction Errors and Intertemporal Choices.
J Neurosci. 2017 Jan 11;37(2):371-382. doi: 10.1523/JNEUROSCI.2532-16.2016.
5
Signals in human striatum are appropriate for policy update rather than value prediction.
J Neurosci. 2011 Apr 6;31(14):5504-11. doi: 10.1523/JNEUROSCI.6316-10.2011.
6
Human dorsal striatal activity during choice discriminates reinforcement learning behavior from the gambler's fallacy.
J Neurosci. 2011 Apr 27;31(17):6296-304. doi: 10.1523/JNEUROSCI.6421-10.2011.
7
Neural Signatures of Prediction Errors in a Decision-Making Task Are Modulated by Action Execution Failures.
Curr Biol. 2019 May 20;29(10):1606-1613.e5. doi: 10.1016/j.cub.2019.04.011. Epub 2019 May 2.
8
How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.
J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.
9
Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning.
Neural Netw. 2006 Oct;19(8):1242-54. doi: 10.1016/j.neunet.2006.06.007. Epub 2006 Sep 20.
10
Beta Oscillations in Monkey Striatum Encode Reward Prediction Error Signals.
J Neurosci. 2023 May 3;43(18):3339-3352. doi: 10.1523/JNEUROSCI.0952-22.2023. Epub 2023 Apr 4.

Cited by

1
The neural correlates of memory integration in value-based decision-making during human spatial navigation.
Neuropsychologia. 2024 Jan 29;193:108758. doi: 10.1016/j.neuropsychologia.2023.108758. Epub 2023 Dec 14.
2
Distinct reinforcement learning profiles distinguish between language and attentional neurodevelopmental disorders.
Behav Brain Funct. 2023 Mar 21;19(1):6. doi: 10.1186/s12993-023-00207-w.
3
Category learning in a recurrent neural network with reinforcement learning.
Front Psychiatry. 2022 Oct 25;13:1008011. doi: 10.3389/fpsyt.2022.1008011. eCollection 2022.
4
The Role of Executive Function in Shaping Reinforcement Learning.
Curr Opin Behav Sci. 2021 Apr;38:66-73. doi: 10.1016/j.cobeha.2020.10.003. Epub 2020 Nov 14.
5
Unraveling the Temporal Dynamics of Reward Signals in Music-Induced Pleasure with TMS.
J Neurosci. 2021 Apr 28;41(17):3889-3899. doi: 10.1523/JNEUROSCI.0727-20.2020. Epub 2021 Mar 29.