Primate Orbitofrontal Cortex Codes Information Relevant for Managing Explore-Exploit Tradeoffs.

Affiliations

Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, Oregon 97239-3098, and

Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892-4415.

Publication information

J Neurosci. 2020 Mar 18;40(12):2553-2561. doi: 10.1523/JNEUROSCI.2355-19.2020. Epub 2020 Feb 14.

Abstract

Reinforcement learning (RL) refers to the behavioral process of learning to obtain reward and avoid punishment. An important component of RL is managing explore-exploit tradeoffs, which refers to the problem of choosing between exploiting options with known values and exploring unfamiliar options. We examined correlates of this tradeoff, as well as other RL-related variables, in orbitofrontal cortex (OFC) while three male monkeys performed a three-armed bandit learning task. During the task, novel choice options periodically replaced familiar options. The values of the novel options were unknown, and the monkeys had to explore them to see if they were better than other currently available options. The identity of the chosen stimulus and the reward outcome were strongly encoded in the responses of single OFC neurons. These two variables define the states and state transitions in our model that are relevant to decision-making. The chosen value of the option and the relative value of exploring that option were encoded at intermediate levels. We also found that OFC value coding was stimulus-specific, as opposed to coding value independent of the identity of the option. The location of the option and the value of the current environment were encoded at low levels. Therefore, we found encoding of the variables relevant to learning and managing explore-exploit tradeoffs in OFC. These results are consistent with findings in the ventral striatum and amygdala and show that this monosynaptically connected network plays an important role in learning based on the immediate and future consequences of choices.

Orbitofrontal cortex (OFC) has been implicated in representing the expected values of choices. Here we extend these results and show that OFC also encodes information relevant to managing explore-exploit tradeoffs. Specifically, OFC encodes an exploration bonus, which characterizes the relative value of exploring novel choice options. OFC also strongly encodes the identity of the chosen stimulus and reward outcomes, which are necessary for computing the value of novel and familiar options.
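
To make the task structure concrete, the following is a minimal Python sketch of a three-armed bandit in which a novel option periodically replaces a familiar one. It is an illustration only and not the model used in the study: the count-based exploration bonus, the delta-rule value update, the function name `run_bandit`, and every parameter value (`replace_every`, `bonus_weight`, `lr`) are assumptions chosen for simplicity.

```python
import random


def run_bandit(n_trials=300, replace_every=20, bonus_weight=0.5, lr=0.2, seed=0):
    """Simulate a three-armed bandit with periodic novel-option replacement.

    All names and parameter values are hypothetical; the count-based bonus
    below is a stand-in for the exploration bonus discussed in the abstract.
    """
    rng = random.Random(seed)
    reward_prob = [rng.random() for _ in range(3)]  # hidden reward probabilities
    value = [0.5, 0.5, 0.5]  # learned value estimate for each option
    n_chosen = [0, 0, 0]     # how often each current option has been chosen
    total_reward = 0
    novel_picks = 0          # choices of an option that was never sampled before

    for t in range(n_trials):
        # Periodically swap one familiar option for a novel one of unknown value.
        if t > 0 and t % replace_every == 0:
            k = rng.randrange(3)
            reward_prob[k] = rng.random()
            value[k] = 0.5
            n_chosen[k] = 0

        # Score = learned value + a bonus that shrinks as the option is sampled.
        score = [value[i] + bonus_weight / (1 + n_chosen[i]) for i in range(3)]
        choice = max(range(3), key=lambda i: score[i])
        if n_chosen[choice] == 0:
            novel_picks += 1

        # Sample a binary reward and update the chosen option with a delta rule.
        reward = 1 if rng.random() < reward_prob[choice] else 0
        value[choice] += lr * (reward - value[choice])
        n_chosen[choice] += 1
        total_reward += reward

    return total_reward, novel_picks


if __name__ == "__main__":
    total, novel = run_bandit()
    print("total reward:", total, "novel-option choices:", novel)
```

Setting `bonus_weight=0` removes the incentive to sample novel options, which gives a simple way to compare purely exploitative behavior against behavior that values exploration.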
