Li Depeng, Zeng Zhigang
IEEE Trans Pattern Anal Mach Intell. 2023 Sep;45(9):10731-10744. doi: 10.1109/TPAMI.2023.3262853. Epub 2023 Aug 7.
Artificial neural networks are prone to catastrophic forgetting: networks trained on something new tend to rapidly forget what was learned previously, a common phenomenon in connectionist models. In this work, we propose an effective and efficient continual learning framework that uses randomization theory, together with Bayes' rule, to equip a single model with the ability to learn streaming data. The core idea of our framework is to preserve the performance of old tasks by guiding the output weights to stay in a region of low error when encountering new tasks. In contrast to existing continual learning approaches, our main contributions are (1) closed-form solutions with detailed theoretical analysis; (2) training continual learners with a single pass over the samples; (3) notable advantages in ease of implementation, parameter efficiency, fast convergence, and strong task-order robustness. Comprehensive experiments on popular image classification benchmarks (FashionMNIST, CIFAR-100, and ImageNet) demonstrate that, in the class-incremental learning scenario, our methods substantially outperform an extensive set of state-of-the-art methods in training speed while remaining superior in accuracy and number of parameters. Code is available at https://github.com/toil2sweet/CRNet.
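The abstract does not spell out the algorithm, but the ingredients it names (closed-form output-weight solutions, a Bayes'-rule update, and one-pass training) match the generic pattern of recursive least-squares learning over fixed random features. The sketch below illustrates that generic pattern only, under the assumption of a random hidden layer and per-task block updates; it is not the authors' CRNet implementation, and all names (RandomFeatureContinualLearner, feature_dim, reg) are illustrative.

    # Minimal sketch (assumption: random-feature network with recursive
    # least-squares / Bayesian linear-regression updates), NOT the CRNet code.
    import numpy as np

    class RandomFeatureContinualLearner:
        def __init__(self, input_dim, feature_dim, num_classes, reg=1e-3, seed=0):
            rng = np.random.default_rng(seed)
            # Fixed random hidden layer (random projection + bias), never trained.
            self.R = rng.standard_normal((input_dim, feature_dim))
            self.b = rng.standard_normal(feature_dim)
            # Covariance-like matrix P starts at (1/reg) * I; output weights W at zero.
            self.P = np.eye(feature_dim) / reg
            self.W = np.zeros((feature_dim, num_classes))

        def _features(self, X):
            # Random nonlinear features of the inputs.
            return np.tanh(X @ self.R + self.b)

        def learn_task(self, X, Y_onehot):
            """One pass over a new task's data; closed-form update, no gradients."""
            H = self._features(X)                          # (n, feature_dim)
            # Block recursive least squares: K = P H^T (I + H P H^T)^{-1}
            S = np.eye(H.shape[0]) + H @ self.P @ H.T
            K = self.P @ H.T @ np.linalg.inv(S)
            self.W += K @ (Y_onehot - H @ self.W)          # correct only the residual
            self.P -= K @ H @ self.P                       # shrink uncertainty

        def predict(self, X):
            return np.argmax(self._features(X) @ self.W, axis=1)

Because P accumulates information from earlier tasks, each new closed-form update moves W only within directions the old tasks leave unconstrained, which is one standard way to keep the weights in a low-error region for previously seen data.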