IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4243-4256. doi: 10.1109/TNNLS.2021.3056201. Epub 2022 Aug 31.
Enabling a neural network to sequentially learn multiple tasks is of great significance for expanding the applicability of neural networks in real-world applications. However, artificial neural networks suffer from the well-known problem of catastrophic forgetting. Worse, the degradation of previously learned skills becomes more severe as the task sequence grows, a phenomenon known as long-term catastrophic forgetting. It arises from two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspaces of these tasks shrinks or may even vanish; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configurations of previous tasks from interference. Inspired by the memory consolidation mechanism of synaptic plasticity in mammalian brains, we propose a confrontation mechanism, Adversarial Neural Pruning and synaptic Consolidation (ANPyC), to overcome long-term catastrophic forgetting. The neural pruning acts as long-term depression, pruning task-irrelevant parameters, while the novel synaptic consolidation acts as long-term potentiation, strengthening task-relevant parameters. During training, this confrontation reaches a balance in which only crucial parameters remain and non-significant parameters are freed to learn subsequent tasks. ANPyC avoids forgetting important information and allows the model to learn a large number of tasks efficiently. Specifically, the neural pruning iteratively relaxes the parameter conditions of the current task to expand the common parameter subspace shared across tasks; the synaptic consolidation strategy, which consists of a structure-aware parameter-importance measure and an element-wise parameter-updating strategy, decreases the cumulative error when learning new tasks. Our approach encourages the synapses to be sparse and polarized, which enables long-term learning and memory. ANPyC demonstrates effectiveness and generalization on both image classification and image generation tasks with multilayer perceptrons, convolutional neural networks, generative adversarial networks, and variational autoencoders. The full source code is available at https://github.com/GeoX-Lab/ANPyC.
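To make the confrontation concrete, the sketch below illustrates the two opposing operations described above in a minimal PyTorch form: magnitude/importance-based pruning frees task-irrelevant weights (long-term depression), and an importance-weighted quadratic penalty anchors task-relevant weights near their learned values when training on the next task (long-term potentiation). This is not the authors' exact ANPyC implementation (see the repository above for that); the importance proxy, the function names, and parameters such as prune_ratio and lam are illustrative assumptions.

import torch
import torch.nn as nn

def prune_low_importance(model, importance, prune_ratio=0.5):
    """Zero out the fraction of weights with the smallest importance scores (pruning)."""
    for name, param in model.named_parameters():
        scores = importance[name]
        k = int(scores.numel() * prune_ratio)
        if k == 0:
            continue
        threshold = scores.flatten().kthvalue(k).values
        mask = (scores > threshold).float()
        param.data.mul_(mask)          # drop weak, task-irrelevant synapses

def consolidation_penalty(model, importance, old_params, lam=100.0):
    """Quadratic penalty that keeps important weights close to their previous values (consolidation)."""
    loss = 0.0
    for name, param in model.named_parameters():
        loss = loss + (importance[name] * (param - old_params[name]) ** 2).sum()
    return lam * loss                  # protect strong, task-relevant synapses

# Toy usage: importance is approximated here by |gradient * weight| on the old task.
model = nn.Linear(10, 2)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
nn.functional.cross_entropy(model(x), y).backward()
importance = {n: (p.grad * p).abs().detach() for n, p in model.named_parameters()}
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
prune_low_importance(model, importance)
new_task_loss = nn.functional.cross_entropy(model(x), y) \
    + consolidation_penalty(model, importance, old_params)

In this toy form, the pruning step enlarges the set of parameters available to future tasks, while the penalty term limits the cumulative error on the protected, polarized weights, which is the balance the abstract describes.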