Differential reinforcement encoding along the hippocampal long axis helps resolve the explore-exploit dilemma.

Author information

Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, 15213, USA.

Department of Psychology, Penn State University, University Park, PA, 16801, USA.

Publication information

Nat Commun. 2020 Oct 26;11(1):5407. doi: 10.1038/s41467-020-18864-0.

DOI: 10.1038/s41467-020-18864-0

PMID: 33106508

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7589536/

Abstract

When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Here we report that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation on a reinforcement learning task with a spatially structured reward function. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.
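The distinction the abstract draws between a local reward prediction error and a global value maximum can be illustrated with a toy learner. The following is a minimal sketch under stated assumptions, not the authors' task or computational model: the one-dimensional option space, the Gaussian reward bump, the kernel-weighted delta rule, the softmax choice rule, and every name and parameter value (`true_reward`, `simulate`, `alpha`, `kernel_width`, `temperature`) are hypothetical choices made for illustration only.

```python
import math
import random


def true_reward(x, peak=0.7, width=0.15):
    """Hypothetical spatially structured reward: one smooth bump on [0, 1]."""
    return math.exp(-((x - peak) ** 2) / (2 * width ** 2))


def simulate(n_trials=200, n_bins=21, alpha=0.3, kernel_width=0.1,
             temperature=0.1, seed=0):
    """Toy learner on a 1D option space (all settings are assumptions)."""
    rng = random.Random(seed)
    centers = [i / (n_bins - 1) for i in range(n_bins)]
    values = [0.0] * n_bins  # learned value estimate for each location
    history = []

    for _ in range(n_trials):
        # Softmax choice: near-uniform early (exploration), then
        # concentrated on high-value locations (exploitation).
        v_top = max(values)
        weights = [math.exp((v - v_top) / temperature) for v in values]
        choice = rng.choices(range(n_bins), weights=weights)[0]
        x = centers[choice]

        # Local signal: reward prediction error at the sampled location.
        rpe = true_reward(x) - values[choice]

        # Spread the update to nearby locations, because in a structured
        # space good options cluster together.
        for j, c in enumerate(centers):
            g = math.exp(-((c - x) ** 2) / (2 * kernel_width ** 2))
            values[j] += alpha * g * rpe

        # Global signal: location and height of the current value maximum.
        v_max = max(values)
        argmax_x = centers[values.index(v_max)]
        history.append((x, rpe, argmax_x, v_max))

    return centers, values, history


if __name__ == "__main__":
    centers, values, history = simulate()
    for trial in (0, 49, 199):
        x, rpe, argmax_x, v_max = history[trial]
        print(f"trial {trial + 1}: x={x:.2f} rpe={rpe:+.3f} "
              f"value max at {argmax_x:.2f} (height {v_max:.3f})")
```

In this sketch the prediction error is a trial-by-trial quantity tied to the sampled location, whereas the value maximum summarizes everything learned so far; that contrast is only a loose analogy to the PH and AH signals described in the abstract.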

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f93/7589536/38c0434ae016/41467_2020_18864_Fig1_HTML.jpg
