

The contribution of striatal pseudo-reward prediction errors to value-based decision-making.

Affiliations

Montreal Neurological Institute, McGill University, Montreal, QC, H3A 2B4, Canada.

Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands; Lyon Neuroscience Research Center - INSERM U1028 - CNRS UMR5292, PSYR2 Team, University of Lyon, Lyon, France.

Publication information

Neuroimage. 2019 Jun;193:67-74. doi: 10.1016/j.neuroimage.2019.02.052. Epub 2019 Mar 7.

DOI: 10.1016/j.neuroimage.2019.02.052
PMID: 30851446
Abstract

Most studies that have investigated the brain mechanisms underlying learning have focused on the ability to learn simple stimulus-response associations. However, in everyday life, outcomes are often obtained through complex behavioral patterns involving a series of actions. Parallel learning systems might be important to reduce the complexity of the learning problem in such scenarios, as proposed in the framework of hierarchical reinforcement learning (HRL). The key feature of HRL is the decomposition of complex sets of action into subgoals. These subgoals are associated with the computation of pseudo-reward prediction errors (PRPEs), which allow the reinforcement of actions that led to a subgoal before the final goal itself is achieved. Here we wanted to test the hypothesis that, despite not carrying any rewarding value per se, pseudo-rewards might generate a bias in choice behavior in the absence of any advantage. Second, we also hypothesized that this bias might be related to the strength of PRPE striatal representations. In order to test these ideas, we developed a novel decision-making paradigm to assess reward prediction errors (RPEs) and PRPEs in two studies (fMRI study: n = 20; behavioral study: n = 19). Our results show that the participants developed a preference for the most pseudo-rewarding option throughout the task, even though it did not lead to more monetary rewards. fMRI analyses revealed that this preference was predicted by individual differences in the relative striatal sensitivity to PRPEs vs RPEs. Together, our results indicate that pseudo-rewards generate learning signals in the striatum and subsequently bias choice behavior despite their lack of association with actual reward.
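The abstract describes two parallel learning signals: a reward prediction error (RPE) driven by the actual monetary outcome, and a pseudo-reward prediction error (PRPE) driven by reaching a subgoal, with choice biased by their relative weighting. A minimal sketch of that idea in a Q-learning style update is below; all variable names, probabilities, and the weighting parameter `w` are illustrative assumptions, not the paper's actual computational model.

```python
import numpy as np

# Illustrative sketch: separate value estimates updated by RPEs (real reward)
# and PRPEs (subgoal pseudo-reward), with choice biased by both.
# All parameters here are hypothetical, not fitted values from the study.

rng = np.random.default_rng(0)
alpha = 0.1       # learning rate
n_options = 2

q_reward = np.zeros(n_options)   # values learned from real reward
q_pseudo = np.zeros(n_options)   # values learned from pseudo-reward

def softmax(q, beta=3.0):
    e = np.exp(beta * (q - q.max()))
    return e / e.sum()

for trial in range(500):
    # w plays the role of relative sensitivity to PRPEs vs RPEs
    w = 0.5
    p = softmax((1 - w) * q_reward + w * q_pseudo)
    choice = rng.choice(n_options, p=p)

    # option 1 reaches the subgoal more often (pseudo-reward),
    # but both options pay the same real reward on average
    pseudo_reward = float(rng.random() < (0.8 if choice == 1 else 0.2))
    reward = float(rng.random() < 0.5)

    rpe = reward - q_reward[choice]          # reward prediction error
    prpe = pseudo_reward - q_pseudo[choice]  # pseudo-reward prediction error
    q_reward[choice] += alpha * rpe
    q_pseudo[choice] += alpha * prpe
```

With these settings, `q_pseudo` comes to favor the more pseudo-rewarding option even though mean real reward is matched, which is the behavioral bias the abstract reports.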


Similar articles

1
The contribution of striatal pseudo-reward prediction errors to value-based decision-making.
Neuroimage. 2019 Jun;193:67-74. doi: 10.1016/j.neuroimage.2019.02.052. Epub 2019 Mar 7.
2
Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making.
J Neurosci. 2007 Nov 21;27(47):12860-7. doi: 10.1523/JNEUROSCI.2496-07.2007.
3
Credit Assignment in a Motor Decision Making Task Is Influenced by Agency and Not Sensory Prediction Errors.
J Neurosci. 2018 May 9;38(19):4521-4530. doi: 10.1523/JNEUROSCI.3601-17.2018. Epub 2018 Apr 12.
4
The Attraction Effect Modulates Reward Prediction Errors and Intertemporal Choices.
J Neurosci. 2017 Jan 11;37(2):371-382. doi: 10.1523/JNEUROSCI.2532-16.2016.
5
Signals in human striatum are appropriate for policy update rather than value prediction.
J Neurosci. 2011 Apr 6;31(14):5504-11. doi: 10.1523/JNEUROSCI.6316-10.2011.
6
Human dorsal striatal activity during choice discriminates reinforcement learning behavior from the gambler's fallacy.
J Neurosci. 2011 Apr 27;31(17):6296-304. doi: 10.1523/JNEUROSCI.6421-10.2011.
7
Neural Signatures of Prediction Errors in a Decision-Making Task Are Modulated by Action Execution Failures.
Curr Biol. 2019 May 20;29(10):1606-1613.e5. doi: 10.1016/j.cub.2019.04.011. Epub 2019 May 2.
8
How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.
J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.
9
Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning.
Neural Netw. 2006 Oct;19(8):1242-54. doi: 10.1016/j.neunet.2006.06.007. Epub 2006 Sep 20.
10
Beta Oscillations in Monkey Striatum Encode Reward Prediction Error Signals.
J Neurosci. 2023 May 3;43(18):3339-3352. doi: 10.1523/JNEUROSCI.0952-22.2023. Epub 2023 Apr 4.

Cited by

1
The neural correlates of memory integration in value-based decision-making during human spatial navigation.
Neuropsychologia. 2024 Jan 29;193:108758. doi: 10.1016/j.neuropsychologia.2023.108758. Epub 2023 Dec 14.
2
Distinct reinforcement learning profiles distinguish between language and attentional neurodevelopmental disorders.
Behav Brain Funct. 2023 Mar 21;19(1):6. doi: 10.1186/s12993-023-00207-w.
3
Category learning in a recurrent neural network with reinforcement learning.
Front Psychiatry. 2022 Oct 25;13:1008011. doi: 10.3389/fpsyt.2022.1008011. eCollection 2022.
4
The Role of Executive Function in Shaping Reinforcement Learning.
Curr Opin Behav Sci. 2021 Apr;38:66-73. doi: 10.1016/j.cobeha.2020.10.003. Epub 2020 Nov 14.
5
Unraveling the Temporal Dynamics of Reward Signals in Music-Induced Pleasure with TMS.
J Neurosci. 2021 Apr 28;41(17):3889-3899. doi: 10.1523/JNEUROSCI.0727-20.2020. Epub 2021 Mar 29.