Modeling awake hippocampal reactivations with model-based bidirectional search.

Authors

Khamassi Mehdi, Girard Benoît

Affiliation

Institute of Intelligent Systems and Robotics (ISIR), Sorbonne Université and CNRS (Centre National de la Recherche Scientifique), 75005, Paris, France.

Publication information

Biol Cybern. 2020 Apr;114(2):231-248. doi: 10.1007/s00422-020-00817-x. Epub 2020 Feb 17.

DOI: 10.1007/s00422-020-00817-x
PMID: 32065253

Abstract

Hippocampal offline reactivations during reward-based learning, usually categorized as replay events, have been found to be important for performance improvement over time and for memory consolidation. Recent computational work has linked these phenomena to the need to transform reward information into state-action values for decision making and to propagate it to all relevant states of the environment. Nevertheless, it is still unclear whether an integrated reinforcement learning mechanism could account for the variety of awake hippocampal reactivations, including variety in order (forward and reverse reactivated trajectories) and variety in the location where they occur (reward site or decision-point). Here, we present a model-based bidirectional search model which accounts for a variety of hippocampal reactivations. The model combines forward trajectory sampling from current position and backward sampling through prioritized sweeping from states associated with large reward prediction errors until the two trajectories connect. This is repeated until stabilization of state-action values (convergence), which could explain why hippocampal reactivations drastically diminish when the animal's performance stabilizes. Simulations in a multiple T-maze task show that forward reactivations are prominently found at decision-points while backward reactivations are exclusively generated at reward sites. Finally, the model can generate imaginary trajectories that are not allowed to the agent during task performance. We raise some experimental predictions and implications for future studies of the role of the hippocampo-prefronto-striatal network in learning.
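
The mechanism summarized in the abstract (forward sampling from the current position, backward prioritized sweeping from high-error states, repeated until values stabilize) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy maze in `MODEL`, the constants `GAMMA` and `THETA`, the forward search depth, the full-backup update rule, and all function names are assumptions chosen only to make the idea concrete.

```python
import heapq

GAMMA = 0.9   # discount factor (assumed value)
THETA = 1e-3  # smallest prediction error worth propagating (assumed value)

# Hypothetical deterministic world model: MODEL[(state, action)] = (next_state, reward).
# States 0..4 form a stem, state 4 is a decision point, state 5 is the rewarded arm.
MODEL = {
    (0, "fwd"): (1, 0.0), (1, "fwd"): (2, 0.0), (2, "fwd"): (3, 0.0),
    (3, "fwd"): (4, 0.0), (4, "left"): (5, 1.0), (4, "right"): (6, 0.0),
}

def actions(s):
    """Actions the model knows for state s (empty for terminal states)."""
    return [a for (s0, a) in MODEL if s0 == s]

def best_value(q, s):
    """Largest Q-value available in state s (0.0 for terminal states)."""
    return max((q[(s, a)] for a in actions(s)), default=0.0)

def prediction_error(q, s, a):
    """One-step error between the model-based target and the current Q-value."""
    s_next, r = MODEL[(s, a)]
    return r + GAMMA * best_value(q, s_next) - q[(s, a)]

def forward_sample(q, start, depth):
    """Greedy rollout through the model from the current position."""
    path, s = [start], start
    for _ in range(depth):
        acts = actions(s)
        if not acts:
            break
        a = max(acts, key=lambda act: q[(s, act)])
        s = MODEL[(s, a)][0]
        path.append(s)
    return path

def backward_sweep(q, forward_path):
    """Prioritized sweeping from high-error pairs until the forward path is met."""
    queue = [(-abs(prediction_error(q, s, a)), s, a) for (s, a) in MODEL]
    queue = [item for item in queue if -item[0] > THETA]
    heapq.heapify(queue)
    visited = []
    while queue:
        _, s, a = heapq.heappop(queue)
        q[(s, a)] += prediction_error(q, s, a)  # full backup (learning rate 1)
        visited.append(s)
        if s in forward_path:  # the two trajectories connect
            break
        # Queue predecessors whose values are now out of date.
        for (sp, ap), (s_next, _) in MODEL.items():
            if s_next == s:
                priority = abs(prediction_error(q, sp, ap))
                if priority > THETA:
                    heapq.heappush(queue, (-priority, sp, ap))
    return visited

def bidirectional_search(start=0, depth=2, max_iters=100):
    """Alternate forward and backward passes until no prediction error remains."""
    q = {pair: 0.0 for pair in MODEL}
    log = []
    for _ in range(max_iters):
        forward = forward_sample(q, start, depth)
        backward = backward_sweep(q, forward)
        if not backward:  # nothing left to propagate: values have converged
            break
        log.append((forward, backward))
    return q, log
```

Calling `bidirectional_search()` returns the learned values and a log of (forward, backward) state sequences per iteration. In this toy setting the backward sequences shorten and then stop once the values settle, which loosely mirrors the abstract's point that reactivations diminish at convergence; the sketch makes no claim about where reactivations occur in the actual multiple T-maze simulations.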
