Continual Learning With Knowledge Distillation: A Survey.

Author Information

Li Songze, Su Tonghua, Zhang Xu-Yao, Wang Zhongjie

Publication Information

IEEE Trans Neural Netw Learn Syst. 2024 Oct 18;PP. doi: 10.1109/TNNLS.2024.3476068.

DOI: 10.1109/TNNLS.2024.3476068
PMID: 39423075
Abstract

The foremost challenge in continual learning is to mitigate catastrophic forgetting, allowing a model to retain knowledge of previous tasks while learning new tasks. Knowledge distillation (KD), a form of regularization, has gained significant attention for its ability to maintain a model's performance on previous tasks by mimicking the outputs of earlier models during the learning of new tasks, thus reducing forgetting. This article offers a comprehensive survey of continual learning methods employing KD within the realm of image classification. We provide a detailed analysis of how KD is utilized in continual learning methods, categorizing its application into three distinct paradigms. Besides, we classify these methods based on the type of knowledge source used and thoroughly examine how KD consolidates memory in continual learning from the perspective of loss functions. In addition, we have conducted extensive experiments on CIFAR-100, TinyImageNet, and ImageNet-100 across ten KD-integrated continual learning methods to analyze the role of KD in continual learning, and we have further discussed its effectiveness in other continual learning tasks. Our extensive experimental evidence demonstrates that KD plays a crucial role in mitigating forgetting in continual learning and substantiates that, when used with data replay, classification bias adversely affects the effectiveness of KD, whereas employing a separated softmax loss can significantly enhance its efficacy.
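To ground the loss-function perspective the abstract describes, the following is a minimal PyTorch sketch of the logit-distillation paradigm combined with a separated softmax classification loss. It is an illustrative sketch under assumptions, not the authors' reference implementation: the names (kd_loss, train_step), the single shared classification head, and the hyperparameters (temperature T, weight lambda_kd) are all hypothetical.

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=2.0):
    # Hinton-style distillation: soften both output distributions with
    # temperature T, then minimize their KL divergence. The T^2 factor
    # keeps gradient magnitudes comparable across temperatures.
    log_p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)

def train_step(model, old_model, x, y, n_old, lambda_kd=1.0):
    # One step on a new-task batch (labels y are >= n_old). old_model is
    # a frozen snapshot of the network taken before the current task began.
    logits = model(x)                      # shape [B, n_old + n_new]
    with torch.no_grad():
        teacher_logits = old_model(x)

    # Separated softmax: cross-entropy is computed over the new-class
    # block only, so old-class logits receive supervision solely from
    # the distillation term below.
    ce = F.cross_entropy(logits[:, n_old:], y - n_old)
    # Distilling the old-class logits toward the frozen teacher's outputs
    # is the regularization that mitigates catastrophic forgetting.
    kd = kd_loss(logits[:, :n_old], teacher_logits[:, :n_old])
    return ce + lambda_kd * kd

The separated cross-entropy mirrors the abstract's finding: with data replay, a joint softmax over all classes biases the classifier toward new classes and undercuts KD, whereas confining the classification loss to the new-class block leaves the old-class logits entirely to the distillation term.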


Similar Articles

1. Continual Learning With Knowledge Distillation: A Survey.
   IEEE Trans Neural Netw Learn Syst. 2024 Oct 18;PP. doi: 10.1109/TNNLS.2024.3476068.
2. Unveiling the Tapestry: The Interplay of Generalization and Forgetting in Continual Learning.
   IEEE Trans Neural Netw Learn Syst. 2025 Mar 21;PP. doi: 10.1109/TNNLS.2025.3546269.
3. Subspace distillation for continual learning.
   Neural Netw. 2023 Oct;167:65-79. doi: 10.1016/j.neunet.2023.07.047. Epub 2023 Aug 6.
4. Continual Learning by Contrastive Learning of Regularized Classes in Multivariate Gaussian Distributions.
   Int J Neural Syst. 2025 Jun;35(6):2550025. doi: 10.1142/S012906572550025X. Epub 2025 Apr 4.
5. Variational Data-Free Knowledge Distillation for Continual Learning.
   IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12618-12634. doi: 10.1109/TPAMI.2023.3271626. Epub 2023 Sep 5.
6. Mitigating carbon footprint for knowledge distillation based deep learning model compression.
   PLoS One. 2023 May 15;18(5):e0285668. doi: 10.1371/journal.pone.0285668. eCollection 2023.
7. Memory-Replay Knowledge Distillation.
   Sensors (Basel). 2021 Apr 15;21(8):2792. doi: 10.3390/s21082792.
8. Enhancing consistency and mitigating bias: A data replay approach for incremental learning.
   Neural Netw. 2025 Apr;184:107053. doi: 10.1016/j.neunet.2024.107053. Epub 2024 Dec 20.
9. A Continual Learning Survey: Defying Forgetting in Classification Tasks.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jul;44(7):3366-3385. doi: 10.1109/TPAMI.2021.3057446. Epub 2022 Jun 3.
10. Continual learning with attentive recurrent neural networks for temporal data classification.
    Neural Netw. 2023 Jan;158:171-187. doi: 10.1016/j.neunet.2022.10.031. Epub 2022 Nov 11.

Cited By

1. KD_MultiSucc: incorporating multi-teacher knowledge distillation and word embeddings for cross-species prediction of protein succinylation sites.
   Biol Methods Protoc. 2025 May 28;10(1):bpaf041. doi: 10.1093/biomethods/bpaf041. eCollection 2025.
2. Confidence-Based, Collaborative, Distributed Continual Learning Framework for Non-Intrusive Load Monitoring in Smart Grids.
   Sensors (Basel). 2025 Jun 11;25(12):3667. doi: 10.3390/s25123667.