Li Kunchi, Wan Jun, Yu Shan
IEEE Trans Image Process. 2022;31:3825-3837. doi: 10.1109/TIP.2022.3176130. Epub 2022 Jun 2.
Recently, owing to their superior performance, knowledge distillation-based (KD-based) methods with exemplar rehearsal have been widely applied in class incremental learning (CIL). However, we find that they suffer from a feature uncalibration problem, which arises from directly transferring knowledge from the old model to the new model when learning a new task. Because the old model confuses the feature representations of the learned and new classes, the KD loss and the classification loss used in KD-based methods are heterogeneous, and learning the existing knowledge directly from the old model, as typical KD-based methods do, is therefore detrimental. To tackle this problem, we propose a feature calibration network (FCN) that calibrates the existing knowledge and alleviates the feature representation confusion of the old model. In addition, to relieve the task-recency bias of the FCN caused by the limited storage memory in CIL, we propose a novel image-feature hybrid sample rehearsal strategy that trains the FCN by splitting the memory budget to store both image and feature exemplars of previous tasks. Since the feature embeddings of images have much lower dimensions, this allows more samples to be stored for training the FCN. Based on these two improvements, we propose the Cascaded Knowledge Distillation Framework (CKDF), which consists of three main stages. The first stage trains the FCN to calibrate the existing knowledge of the old model. Then, the new model is trained by simultaneously transferring knowledge from the calibrated teacher model through knowledge distillation and learning the new classes. Finally, after the new task has been learned, the feature exemplars of previous tasks are updated. Importantly, we demonstrate that the proposed CKDF is a general framework that can be applied to various KD-based methods. Experimental results show that our method achieves state-of-the-art performance on several CIL benchmarks.
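To make the three-stage pipeline described in the abstract concrete, the following is a minimal PyTorch-style sketch of one CKDF task increment. All interfaces here are assumptions made for illustration (the FCN as a small MLP, a backbone that returns feature vectors, a model that returns logits and features, the memory split, and the loss weighting); the paper's actual architectures, hyperparameters, and training schedules are not specified in this abstract.

```python
# Hypothetical sketch of one CKDF increment: (1) calibrate old knowledge with an
# FCN, (2) train the new model with classification + KD from the calibrated
# teacher, (3) refresh the stored feature exemplars. Interfaces are assumed.

import torch
import torch.nn as nn
import torch.nn.functional as F


class FCN(nn.Module):
    """Assumed feature calibration network: a small MLP that maps the old
    model's features into a calibrated feature space."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, feat):
        return self.net(feat)


def kd_loss(student_logits, teacher_logits, T=2.0):
    """Standard soft-target knowledge distillation loss."""
    p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(p, q, reduction="batchmean") * T * T


def train_fcn(fcn, old_classifier, feature_exemplars, labels, epochs=10, lr=1e-3):
    """Stage 1: calibrate the old model's knowledge by fitting the FCN on the
    stored (low-dimensional) feature exemplars of previous tasks."""
    opt = torch.optim.Adam(fcn.parameters(), lr=lr)
    for _ in range(epochs):
        calibrated = fcn(feature_exemplars)
        loss = F.cross_entropy(old_classifier(calibrated), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()


def train_new_task(new_model, old_backbone, fcn, old_classifier, loader,
                   lr=0.1, kd_weight=1.0):
    """Stage 2: learn the new classes while distilling from the calibrated
    teacher (old backbone -> FCN -> old classifier)."""
    opt = torch.optim.SGD(new_model.parameters(), lr=lr, momentum=0.9)
    for images, targets in loader:  # loader mixes new data and image exemplars
        logits, _ = new_model(images)  # assumed to return (logits, features)
        with torch.no_grad():
            teacher_logits = old_classifier(fcn(old_backbone(images)))
        n_old = teacher_logits.size(1)
        loss = F.cross_entropy(logits, targets) \
             + kd_weight * kd_loss(logits[:, :n_old], teacher_logits)
        opt.zero_grad()
        loss.backward()
        opt.step()


@torch.no_grad()
def update_feature_exemplars(new_backbone, exemplar_images):
    """Stage 3: re-extract feature exemplars with the updated backbone; their
    low dimensionality lets the fixed memory budget hold more samples."""
    return new_backbone(exemplar_images)
```

As a design note, the hybrid rehearsal strategy in the abstract would correspond to splitting a fixed memory budget between raw image exemplars (used in stage 2) and feature exemplars (used in stages 1 and 3); how that budget is divided is a hyperparameter of the method, not shown here.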