Memory Transformation Enhances Reinforcement Learning in Dynamic Environments.

Author information

Santoro Adam, Frankland Paul W, Richards Blake A

Affiliations

Institute of Medical Sciences, University of Toronto, Toronto, Ontario M5S 1A8, Canada.

Program in Neurosciences and Mental Health, Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada.

Publication information

J Neurosci. 2016 Nov 30;36(48):12228-12242. doi: 10.1523/JNEUROSCI.0763-16.2016.

Abstract

Over the course of systems consolidation, there is a switch from a reliance on detailed episodic memories to generalized schematic memories. This switch is sometimes referred to as "memory transformation." Here we demonstrate a previously unappreciated benefit of memory transformation, namely, its ability to enhance reinforcement learning in a dynamic environment. We developed a neural network that is trained to find rewards in a foraging task where reward locations are continuously changing. The network can use memories for specific locations (episodic memories) and statistical patterns of locations (schematic memories) to guide its search. We find that switching from an episodic to a schematic strategy over time leads to enhanced performance due to the tendency for the reward location to be highly correlated with itself in the short term but regress to a stable distribution in the long term. We also show that the statistics of the environment determine the optimal utilization of both types of memory. Our work recasts the theoretical question of why memory transformation occurs, shifting the focus from the avoidance of memory interference toward the enhancement of reinforcement learning across multiple timescales.
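
The timescale argument in the abstract can be illustrated with a toy calculation. The sketch below is not the network described in the paper. It assumes, purely for illustration, that the reward location follows a stationary first-order autoregressive (AR(1)) process with autocorrelation `phi` per time step and stationary variance `stationary_var`. Under that assumption, reusing the last observed location (a stand-in for an episodic memory) has expected squared error 2·V·(1 − phi^k) after a delay of k steps, while always guessing the long-run mean (a stand-in for a schematic memory) has expected squared error V. All function and parameter names are hypothetical.

```python
import math


def episodic_mse(delay, phi, stationary_var):
    """Expected squared error when the location seen `delay` steps ago
    is reused as the prediction (assumed stationary AR(1) process)."""
    # Var(x_{t+k} - x_t) = 2V - 2V * phi**k for a stationary AR(1) process.
    return 2.0 * stationary_var * (1.0 - phi ** delay)


def schematic_mse(stationary_var):
    """Expected squared error when the long-run mean location is always
    used as the prediction, regardless of delay."""
    return stationary_var


def crossover_delay(phi):
    """Smallest integer delay at which the schematic guess has strictly
    lower expected error than the episodic one. Requires 0 < phi < 1."""
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    # The episodic guess is worse once phi**k < 0.5.
    return math.floor(math.log(0.5) / math.log(phi)) + 1


if __name__ == "__main__":
    phi, var = 0.9, 1.0  # illustrative values, not taken from the paper
    for k in (1, 3, 6, 7, 20):
        print(k, round(episodic_mse(k, phi, var), 3), schematic_mse(var))
    print("switch to the schematic guess at delay", crossover_delay(phi))
```

In this toy setting the better strategy depends only on how old the memory is and on `phi`: a more strongly autocorrelated environment (larger `phi`) pushes the crossover to longer delays, which is one simple way to read the claim that the statistics of the environment determine how each type of memory is best used.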

SIGNIFICANCE STATEMENT

As time passes, memories transform from a highly detailed state to a more gist-like state, in a process called "memory transformation." Theories of memory transformation speak to its advantages in terms of reducing memory interference, increasing memory robustness, and building models of the environment. However, the role of memory transformation from the perspective of an agent that continuously acts and receives reward in its environment is not well explored. In this work, we demonstrate a view of memory transformation that defines it as a way of optimizing behavior across multiple timescales.
