Moon Hyung-Jun, Cho Sung-Bae
Department of Artificial Intelligence, Yonsei University, 50 Yonsei-ro, Sudaemoon-gu, Seoul 03722, South Korea.
Department of Computer Science, Yonsei University, 50 Yonsei-ro, Sudaemoon-gu, Seoul 03722, South Korea.
Int J Neural Syst. 2025 Jun;35(6):2550025. doi: 10.1142/S012906572550025X. Epub 2025 Apr 4.
Deep neural networks struggle with incremental updates due to catastrophic forgetting, where newly acquired knowledge interferes with previously learned knowledge. Continual learning (CL) methods aim to overcome this limitation by updating the model without losing previous knowledge, but they struggle to maintain knowledge of previous tasks because the stored information overlaps. In this paper, we propose a CL method that preserves previous knowledge as multivariate Gaussian distributions by independently storing the model's outputs per class and continually reproducing them for future tasks. We enhance the discriminability between classes and ensure plasticity for future tasks by exploiting contrastive learning and representation regularization. The class-wise spatial means and covariances, distinguished in the latent space, are stored in memory, where previous knowledge is effectively preserved and reproduced for incremental tasks. Extensive experiments on benchmark datasets such as CIFAR-10, CIFAR-100, and ImageNet-100 demonstrate that the proposed method achieves accuracies of 93.21%, 77.57%, and 78.15%, respectively, outperforming state-of-the-art CL methods by 2.34%p, 2.1%p, and 1.91%p. Additionally, it achieves the lowest mean forgetting rates across all datasets.
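The core idea of the abstract (storing each class's latent statistics as a multivariate Gaussian and sampling pseudo-features from them when training on later tasks) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of NumPy, and the plain empirical mean/covariance estimates are all assumptions for the sake of the example, and the contrastive-learning and regularization components are omitted.

```python
import numpy as np

def store_class_statistics(features, labels):
    """Store a per-class multivariate Gaussian (mean, covariance) of
    latent features. This stands in for the paper's class-wise memory;
    the estimator choice is an assumption for illustration."""
    memory = {}
    for c in np.unique(labels):
        feats = features[labels == c]
        mean = feats.mean(axis=0)
        # rowvar=False: rows are samples, columns are feature dimensions
        cov = np.cov(feats, rowvar=False)
        memory[int(c)] = (mean, cov)
    return memory

def replay_features(memory, n_per_class, seed=None):
    """Sample pseudo-features from the stored Gaussians so that old
    classes can be rehearsed alongside new-task data."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for c, (mean, cov) in memory.items():
        xs.append(rng.multivariate_normal(mean, cov, size=n_per_class))
        ys.append(np.full(n_per_class, c))
    return np.concatenate(xs), np.concatenate(ys)
```

In a full CL pipeline, `store_class_statistics` would run on the encoder's outputs after each task, and `replay_features` would supply samples for the classifier head on subsequent tasks, avoiding storage of raw exemplars.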