IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):16439-16452. doi: 10.1109/TNNLS.2023.3294495. Epub 2024 Oct 29.
Learning from a sequence of tasks over a lifetime is essential for an agent progressing toward artificial general intelligence. Despite the rapid growth of this research field in recent years, most work focuses on the well-known catastrophic forgetting issue. In contrast, this work explores knowledge-transferable lifelong learning without storing historical data or incurring significant additional computational overhead. We demonstrate that existing data-free frameworks, including regularization-based single-network and structure-based multinetwork frameworks, face a fundamental issue of lifelong learning that we name anterograde forgetting: preserving and transferring memory may inhibit the learning of new knowledge. We attribute it to two causes: the learning network's capacity decreases as it memorizes historical knowledge, and conceptual confusion arises between irrelevant old knowledge and the current task. Inspired by complementary learning theory in neuroscience, we endow artificial neural networks with the ability to learn continuously without forgetting while recalling historical knowledge to facilitate the learning of new knowledge. Specifically, this work proposes a general framework named cycle memory networks (CMNs). A CMN consists of two individual memory networks that store short- and long-term memories separately to avoid capacity shrinkage, plus a transfer cell between them that enables knowledge transfer from the long-term to the short-term memory network to mitigate conceptual confusion. In addition, a memory consolidation mechanism integrates short-term knowledge into the long-term memory network for knowledge accumulation. We demonstrate that the CMN effectively addresses anterograde forgetting on several task-related, task-conflict, class-incremental, and cross-domain benchmarks. Furthermore, we provide extensive ablation studies to verify each framework component. The source code is available at: https://github.com/GeoX-Lab/CMN.
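To make the described architecture concrete, the following is a minimal, hypothetical sketch of the dual-network design from the abstract: a short-term network that learns the current task, a frozen long-term network that recalls accumulated knowledge, a transfer cell that gates long-term features into the short-term pathway, and a distillation-style consolidation step. All class and function names, the sigmoid gating form, and the KL-distillation consolidation are illustrative assumptions, not the authors' implementation; the actual code is at https://github.com/GeoX-Lab/CMN.

```python
# Hypothetical sketch only: the gating and consolidation details below are
# assumptions for illustration and may differ from the paper's method.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class TransferCell(nn.Module):
    """Gates long-term features into the short-term pathway (assumed form)."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, short_feat: torch.Tensor, long_feat: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([short_feat, long_feat], dim=-1)))
        return short_feat + g * long_feat  # inject only task-relevant old knowledge


class CMN(nn.Module):
    """Two memory networks plus a transfer cell, as described in the abstract."""

    def __init__(self, backbone_fn, dim: int, num_classes: int):
        super().__init__()
        self.short_term = backbone_fn()  # trained on the current task
        self.long_term = backbone_fn()   # holds consolidated historical knowledge
        self.cell = TransferCell(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.short_term(x)
        with torch.no_grad():  # long-term memory is frozen while learning a new task
            l = self.long_term(x)
        return self.head(self.cell(s, l))


def consolidate(model: CMN, loader, lr: float = 1e-3, temperature: float = 2.0) -> None:
    """Distill short-term knowledge into the long-term network (assumed to be a
    simple logit-matching step; the paper's consolidation mechanism may differ)."""
    head = copy.deepcopy(model.head).requires_grad_(False)  # fixed readout head
    opt = torch.optim.SGD(model.long_term.parameters(), lr=lr)
    for x, _ in loader:
        with torch.no_grad():
            teacher = model(x) / temperature  # short-term pathway acts as teacher
        student = head(model.long_term(x)) / temperature
        loss = F.kl_div(
            F.log_softmax(student, dim=-1),
            F.softmax(teacher, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Under these assumptions, a lifelong-learning loop would train model.short_term, model.cell, and model.head on each new task, then call consolidate(model, task_loader) to fold the new knowledge into the long-term network before moving on.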