
Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model.

Author Information

Johnson Adam, Redish A David

Affiliation

Center for Cognitive Sciences and Graduate Program in Neuroscience, University of Minnesota, MN 55455, USA.

Publication Information

Neural Netw. 2005 Nov;18(9):1163-71. doi: 10.1016/j.neunet.2005.08.009. Epub 2005 Sep 29.

Abstract

Temporal difference reinforcement learning (TDRL) algorithms, hypothesized to partially explain basal ganglia functionality, learn more slowly than real animals. Modified TDRL algorithms (e.g. the Dyna-Q family) learn faster than standard TDRL by practicing experienced sequences offline. We suggest that the replay phenomenon, in which ensembles of hippocampal neurons replay previously experienced firing sequences during subsequent rest and sleep, may provide practice sequences to improve the speed of TDRL learning, even within a single session. We test the plausibility of this hypothesis in a computational model of a multiple-T choice-task. Rats show two learning rates on this task: a fast decrease in errors and a slow development of a stereotyped path. Adding developing replay to the model accelerates learning the correct path, but slows down the stereotyping of that path. These models provide testable predictions relating the effects of hippocampal inactivation as well as hippocampal replay on this task.
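The abstract contrasts standard TDRL with Dyna-Q-style algorithms that learn faster by re-running stored experience offline, and proposes hippocampal replay as a source of such practice sequences. The sketch below illustrates only that general Dyna-style mechanism; it is not the authors' model of the multiple-T task, and all names and parameter values (ALPHA, GAMMA, N_REPLAY, td_update, step, the memory buffer) are illustrative assumptions.

```python
import random
from collections import defaultdict

# Minimal Dyna-Q-style sketch (assumed illustration, not the authors' model):
# one-step TD/Q-learning updates from online experience, plus extra offline
# "replay" updates drawn from a memory of previously experienced transitions.

ALPHA = 0.1     # learning rate (assumed value)
GAMMA = 0.95    # discount factor (assumed value)
N_REPLAY = 10   # offline replay updates per real step (Dyna-style practice)

Q = defaultdict(float)   # Q[(state, action)] -> current value estimate
memory = []              # stored transitions: (s, a, r, s_next, next_actions)

def td_update(s, a, r, s_next, next_actions):
    """One temporal-difference (Q-learning) backup on a single transition."""
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def step(s, a, r, s_next, next_actions):
    """Online update from real experience, then replay stored transitions."""
    memory.append((s, a, r, s_next, next_actions))
    td_update(s, a, r, s_next, next_actions)
    # Offline replay: re-apply TD backups to previously experienced
    # transitions, analogous to replayed firing sequences providing
    # additional practice within a single session.
    for _ in range(min(N_REPLAY, len(memory))):
        td_update(*random.choice(memory))
```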

