Neuronal Encoding in Prefrontal Cortex during Hierarchical Reinforcement Learning.

Affiliation

University of California at Berkeley.

Publication information

J Cogn Neurosci. 2018 Aug;30(8):1197-1208. doi: 10.1162/jocn_a_01272. Epub 2018 Apr 25.

DOI: 10.1162/jocn_a_01272
PMID: 29694261
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7328788/

Abstract

Reinforcement learning models have proven highly effective for understanding learning in both artificial and biological systems. However, these models have difficulty scaling up to the complexity of real-life environments. One solution is to incorporate the hierarchical structure of behavior. In hierarchical reinforcement learning, primitive actions are chunked together into more temporally abstract actions, called "options," that are reinforced by attaining a subgoal. These subgoals are capable of generating pseudoreward prediction errors, which are distinct from the reward prediction errors associated with the final goal of the behavior. Studies in humans have shown that pseudoreward prediction errors positively correlate with activation of anterior cingulate cortex (ACC). To determine how pseudoreward prediction errors are encoded at the single-neuron level, we trained two animals to perform a primate version of the task used to generate these errors in humans. We recorded the electrical activity of neurons in ACC during performance of this task, as well as neurons in lateral prefrontal cortex and orbitofrontal cortex (OFC). We found that the firing rate of a small population of neurons encoded pseudoreward prediction errors, and these neurons were restricted to ACC. Our results provide support for the idea that ACC may play an important role in encoding subgoals and pseudoreward prediction errors to support hierarchical reinforcement learning. One caveat is that neurons encoding pseudoreward prediction errors were relatively few in number, especially in comparison to neurons that encoded information about the main goal of the task.
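
The distinction the abstract draws between reward prediction errors and pseudoreward prediction errors can be illustrated with a small sketch. This is not the authors' model or task: it assumes a standard temporal-difference error, a hypothetical three-state chain (start, subgoal, goal), and made-up values for the discount factor, learning rate, and reward sizes. All names in the code are illustrative.

```python
# Hypothetical parameters, chosen only for illustration.
GAMMA = 0.9  # discount factor (assumed)
ALPHA = 0.1  # learning rate (assumed)


def prediction_error(outcome, value_next, value_current, gamma=GAMMA):
    """Temporal-difference error: outcome + gamma * V(next) - V(current)."""
    return outcome + gamma * value_next - value_current


# Top-level values track primary reward, delivered only at the final goal.
top_values = {"start": 0.0, "subgoal": 0.0}
# Option-level values track pseudoreward, delivered when the subgoal is attained.
option_values = {"start": 0.0}

# Event 1: the option terminates at the subgoal.
# The option receives a pseudoreward of 1; no primary reward is delivered.
# The subgoal is terminal for the option, so its next-state value is 0.
pseudo_rpe = prediction_error(1.0, 0.0, option_values["start"])
rpe_at_subgoal = prediction_error(0.0, top_values["subgoal"], top_values["start"])

option_values["start"] += ALPHA * pseudo_rpe
top_values["start"] += ALPHA * rpe_at_subgoal

# Event 2: the final goal is reached and a primary reward of 1 is delivered.
rpe_at_goal = prediction_error(1.0, 0.0, top_values["subgoal"])
top_values["subgoal"] += ALPHA * rpe_at_goal

print("pseudoreward prediction error at subgoal:", pseudo_rpe)
print("reward prediction error at subgoal:", rpe_at_subgoal)
print("reward prediction error at goal:", rpe_at_goal)
```

In this toy setup, attaining the subgoal produces a nonzero pseudoreward prediction error while the reward prediction error stays at zero, which is the kind of dissociation a task must create for the two signals to be told apart.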
