Encoding of both positive and negative reward prediction errors by neurons of the primate lateral prefrontal cortex and caudate nucleus.

Affiliations

Rhodan Center for Nervous System Repair, Department of Neurosurgery, Massachusetts General Hospital, and Harvard Medical School, Boston, Massachusetts 02114, USA.

Publication information

J Neurosci. 2011 Dec 7;31(49):17772-87. doi: 10.1523/JNEUROSCI.3793-11.2011.

PMID:22159094

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3266530/

Abstract

Learning can be motivated by unanticipated success or unexpected failure. The former encourages us to repeat an action or activity, whereas the latter leads us to find an alternative strategy. Understanding the neural representation of these unexpected events is therefore critical to elucidate learning-related circuits. We examined the activity of neurons in the lateral prefrontal cortex (PFC) and caudate nucleus of monkeys as they performed a trial-and-error learning task. Unexpected outcomes were widely represented in both structures, and neurons driven by unexpectedly negative outcomes were as frequent as those activated by unexpectedly positive outcomes. Moreover, both positive and negative reward prediction errors (RPEs) were represented primarily by increases in firing rate, unlike the manner in which dopamine neurons have been observed to reflect these values. Interestingly, positive RPEs tended to appear with shorter latency than negative RPEs, perhaps reflecting the mechanism of their generation. Last, in the PFC but not the caudate, trial-by-trial variations in outcome-related activity were linked to the animals' subsequent behavioral decisions. More broadly, the robustness of RPE signaling by these neurons suggests that actor-critic models of reinforcement learning in which the PFC and particularly the caudate are considered primarily to be "actors" rather than "critics," should be reconsidered to include a prominent evaluative role for these structures.
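
As a reading aid, the notion of a signed reward prediction error (RPE) used in the abstract can be sketched in a few lines of Python. This is a generic textbook-style illustration, not the authors' analysis: the function names, the learning rate of 0.2, and the starting expectation of 0.5 are all assumptions chosen for the example. The sketch also shows one simple way that both positive and negative RPEs could each be carried by an increase in a non-negative signal, by splitting the signed error into two rectified channels.

```python
def reward_prediction_error(reward, expected):
    """Signed RPE: positive when the outcome is better than expected,
    negative when it is worse."""
    return reward - expected


def rectified_channels(rpe):
    """Split a signed RPE into two non-negative channels (hypothetical).

    The first value grows with positive RPEs, the second with negative
    RPEs, so either kind of surprise appears as an increase in one channel.
    """
    return max(rpe, 0.0), max(-rpe, 0.0)


def update_value(expected, rpe, learning_rate=0.2):
    """Move the expectation toward the outcome by a fraction of the RPE.
    The learning rate is an assumed illustrative value."""
    return expected + learning_rate * rpe


if __name__ == "__main__":
    expected = 0.5  # assumed initial reward expectation
    for reward in (1.0, 0.0):  # an unexpected success, then a failure
        rpe = reward_prediction_error(reward, expected)
        positive, negative = rectified_channels(rpe)
        print(f"reward={reward} rpe={rpe:+.2f} "
              f"positive={positive:.2f} negative={negative:.2f}")
        expected = update_value(expected, rpe)
```

In actor-critic terms, the quantity returned by `reward_prediction_error` is what a "critic" would compute, which is the evaluative role the abstract argues these structures may share.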
