

Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model.

Author Information

Johnson Adam, Redish A David

Affiliation

Center for Cognitive Sciences and Graduate Program in Neuroscience, University of Minnesota, MN 55455, USA.

Publication Information

Neural Netw. 2005 Nov;18(9):1163-71. doi: 10.1016/j.neunet.2005.08.009. Epub 2005 Sep 29.

DOI: 10.1016/j.neunet.2005.08.009
PMID: 16198539
Abstract

Temporal difference reinforcement learning (TDRL) algorithms, hypothesized to partially explain basal ganglia functionality, learn more slowly than real animals. Modified TDRL algorithms (e.g. the Dyna-Q family) learn faster than standard TDRL by practicing experienced sequences offline. We suggest that the replay phenomenon, in which ensembles of hippocampal neurons replay previously experienced firing sequences during subsequent rest and sleep, may provide practice sequences to improve the speed of TDRL learning, even within a single session. We test the plausibility of this hypothesis in a computational model of a multiple-T choice-task. Rats show two learning rates on this task: a fast decrease in errors and a slow development of a stereotyped path. Adding developing replay to the model accelerates learning the correct path, but slows down the stereotyping of that path. These models provide testable predictions relating the effects of hippocampal inactivation as well as hippocampal replay on this task.

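The mechanism the abstract describes — standard TD learning sped up by offline practice of remembered transitions, as in the Dyna-Q family — can be illustrated with a minimal sketch. This is not the authors' model (which used a multiple-T choice task and replay that develops over the session); it is a hypothetical tabular Dyna-Q agent on a simple linear track, showing only the core idea that replaying stored experience between real steps propagates value faster than online updates alone.

```python
import random

def dyna_q(n_states=10, n_episodes=30, n_replay=20,
           alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Dyna-Q on a toy linear track (states 0..n_states-1).
    Actions: 0 = step left, 1 = step right; reward 1 on reaching the
    rightmost state. After each real step, n_replay offline updates are
    drawn from remembered transitions -- the stand-in here for replayed
    experience. Returns the number of steps taken in each episode."""
    Q = [[0.0, 0.0] for _ in range(n_states)]
    model = {}  # (state, action) -> (reward, next_state)
    steps_per_episode = []
    for _ in range(n_episodes):
        s, steps = 0, 0
        while s != n_states - 1:
            # epsilon-greedy action selection with random tie-breaking
            if random.random() < epsilon or Q[s][0] == Q[s][1]:
                a = random.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # online TD update from the real transition
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            model[(s, a)] = (r, s2)
            s, steps = s2, steps + 1
            # offline "replay": re-practice stored transitions (Dyna-style)
            for _ in range(n_replay):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) - Q[ps][pa])
        steps_per_episode.append(steps)
    return steps_per_episode
```

Setting `n_replay=0` recovers plain one-step Q-learning, so the two runs can be compared directly; with replay, the reward found at the end of the first episode back-propagates through the whole track within an episode or two rather than one state per episode. Note the sketch captures only the speed-up; it does not reproduce the paper's second finding, that replay slows the stereotyping of the learned path.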

Similar Articles

1
Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model.
Neural Netw. 2005 Nov;18(9):1163-71. doi: 10.1016/j.neunet.2005.08.009. Epub 2005 Sep 29.
2
A model of hippocampally dependent navigation, using the temporal difference learning rule.
Hippocampus. 2000;10(1):1-16. doi: 10.1002/(SICI)1098-1063(2000)10:1<1::AID-HIPO1>3.0.CO;2-1.
3
Reverse replay of behavioural sequences in hippocampal place cells during the awake state.
Nature. 2006 Mar 30;440(7084):680-3. doi: 10.1038/nature04587. Epub 2006 Feb 12.
4
Hippocampal replay is not a simple function of experience.
Neuron. 2010 Mar 11;65(5):695-705. doi: 10.1016/j.neuron.2010.01.034.
5
Addiction as a computational process gone awry.
Science. 2004 Dec 10;306(5703):1944-7. doi: 10.1126/science.1102384.
6
Parallel processing across neural systems: implications for a multiple memory system hypothesis.
Neurobiol Learn Mem. 2004 Nov;82(3):278-98. doi: 10.1016/j.nlm.2004.07.007.
7
Modeling awake hippocampal reactivations with model-based bidirectional search.
Biol Cybern. 2020 Apr;114(2):231-248. doi: 10.1007/s00422-020-00817-x. Epub 2020 Feb 17.
8
Reward value invariant place responses and reward site associated activity in hippocampal neurons of behaving rats.
Hippocampus. 2003;13(1):117-32. doi: 10.1002/hipo.10056.
9
Reactivation of behavioral activity during sharp waves: a computational model for two stage hippocampal dynamics.
Hippocampus. 2007;17(3):201-9. doi: 10.1002/hipo.20258.
10
Repeated acquisition and performance chamber for mice: a paradigm for assessment of spatial learning and memory.
Neurobiol Learn Mem. 2000 Nov;74(3):241-58. doi: 10.1006/nlme.1999.3951.

Cited By

1
Global remapping emerges as the mechanism for renewal of context-dependent behavior in a reinforcement learning model.
Front Comput Neurosci. 2025 Jan 15;18:1462110. doi: 10.3389/fncom.2024.1462110. eCollection 2024.
2
Dorsal hippocampus represents locations to avoid as well as locations to approach during approach-avoidance conflict.
PLoS Biol. 2025 Jan 14;23(1):e3002954. doi: 10.1371/journal.pbio.3002954. eCollection 2025 Jan.
3
Dorsal hippocampus represents locations to avoid as well as locations to approach during approach-avoidance conflict.
bioRxiv. 2024 Mar 12:2024.03.10.584295. doi: 10.1101/2024.03.10.584295.
4
Role of cerebellum in sleep-dependent memory processes.
Front Syst Neurosci. 2023 Apr 18;17:1154489. doi: 10.3389/fnsys.2023.1154489. eCollection 2023.
5
A model of hippocampal replay driven by experience and environmental structure facilitates spatial learning.
Elife. 2023 Mar 14;12:e82301. doi: 10.7554/eLife.82301.
6
A computational model of learning flexible navigation in a maze by layout-conforming replay of place cells.
Front Comput Neurosci. 2023 Feb 9;17:1053097. doi: 10.3389/fncom.2023.1053097. eCollection 2023.
7
How our understanding of memory replay evolves.
J Neurophysiol. 2023 Mar 1;129(3):552-580. doi: 10.1152/jn.00454.2022. Epub 2023 Feb 8.
8
Humans account for cognitive costs when finding shortcuts: An information-theoretic analysis of navigation.
PLoS Comput Biol. 2023 Jan 6;19(1):e1010829. doi: 10.1371/journal.pcbi.1010829. eCollection 2023 Jan.
9
Sampling motion trajectories during hippocampal theta sequences.
Elife. 2022 Nov 8;11:e74058. doi: 10.7554/eLife.74058.
10
Learning Structures: Predictive Representations, Replay, and Generalization.
Curr Opin Behav Sci. 2020 Apr;32:155-166. doi: 10.1016/j.cobeha.2020.02.017. Epub 2020 May 5.