
Online Knowledge Distillation via Mutual Contrastive Learning for Visual Recognition.

Author Information

Yang Chuanguang, An Zhulin, Zhou Helong, Zhuang Fuzhen, Xu Yongjun, Zhang Qian

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):10212-10227. doi: 10.1109/TPAMI.2023.3257878. Epub 2023 Jun 30.

DOI: 10.1109/TPAMI.2023.3257878
PMID: 37030723
Abstract

The teacher-free online Knowledge Distillation (KD) aims to train an ensemble of multiple student models collaboratively and distill knowledge from each other. Although existing online KD methods achieve desirable performance, they often focus on class probabilities as the core knowledge type, ignoring the valuable feature representational information. We present a Mutual Contrastive Learning (MCL) framework for online KD. The core idea of MCL is to perform mutual interaction and transfer of contrastive distributions among a cohort of networks in an online manner. Our MCL can aggregate cross-network embedding information and maximize the lower bound to the mutual information between two networks. This enables each network to learn extra contrastive knowledge from others, leading to better feature representations, thus improving the performance of visual recognition tasks. Beyond the final layer, we extend MCL to intermediate layers and perform an adaptive layer-matching mechanism trained by meta-optimization. Experiments on image classification and transfer learning to visual recognition tasks show that layer-wise MCL can lead to consistent performance gains against state-of-the-art online KD approaches. The superiority demonstrates that layer-wise MCL can guide the network to generate better feature representations. Our code is publicly available at https://github.com/winycg/L-MCL.
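The abstract describes transferring contrastive distributions between peer networks and maximizing a lower bound on the mutual information between their embeddings. The snippet below is a minimal PyTorch-style sketch of one such cross-network contrastive term; the InfoNCE-style loss, the function names, and the temperature value are illustrative assumptions rather than the authors' exact formulation (see https://github.com/winycg/L-MCL for the released code).

```python
# Minimal sketch of a cross-network contrastive term in the spirit of
# Mutual Contrastive Learning (MCL). Names and the temperature value are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def cross_network_infonce(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss where anchors come from network A while the
    positives (same image) and negatives (other images in the batch)
    come from network B. Maximizing this agreement lower-bounds the
    mutual information between the two networks' embeddings.

    z_a, z_b: (batch, dim) embeddings of the same batch from two peers.
    """
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature                     # (batch, batch) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)   # positives on the diagonal
    return F.cross_entropy(logits, targets)

def mutual_contrastive_loss(z_a, z_b, temperature=0.1):
    # Symmetric ("mutual") version: each network in turn serves as the anchor side.
    return 0.5 * (cross_network_infonce(z_a, z_b, temperature)
                  + cross_network_infonce(z_b, z_a, temperature))
```

In training, a loss of this form would be added to each student's usual task loss so that the cohort exchanges feature-level (not only logit-level) knowledge online.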

Similar Articles

1. Online Knowledge Distillation via Mutual Contrastive Learning for Visual Recognition.
IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):10212-10227. doi: 10.1109/TPAMI.2023.3257878. Epub 2023 Jun 30.
2. Knowledge Distillation Using Hierarchical Self-Supervision Augmented Distribution.
IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2094-2108. doi: 10.1109/TNNLS.2022.3186807. Epub 2024 Feb 5.
3. DCCD: Reducing Neural Network Redundancy via Distillation.
IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):10006-10017. doi: 10.1109/TNNLS.2023.3238337. Epub 2024 Jul 8.
4. Leveraging different learning styles for improved knowledge distillation in biomedical imaging.
Comput Biol Med. 2024 Jan;168:107764. doi: 10.1016/j.compbiomed.2023.107764. Epub 2023 Nov 30.
5. Spot-Adaptive Knowledge Distillation.
IEEE Trans Image Process. 2022;31:3359-3370. doi: 10.1109/TIP.2022.3170728. Epub 2022 May 9.
6. Knowledge Distillation Meets Label Noise Learning: Ambiguity-Guided Mutual Label Refinery.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):939-952. doi: 10.1109/TNNLS.2023.3335829. Epub 2025 Jan 7.
7. Knowledge Transfer via Decomposing Essential Information in Convolutional Neural Networks.
IEEE Trans Neural Netw Learn Syst. 2022 Jan;33(1):366-377. doi: 10.1109/TNNLS.2020.3027837. Epub 2022 Jan 5.
8. Multi-task prediction-based graph contrastive learning for inferring the relationship among lncRNAs, miRNAs and diseases.
Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad276.
9. STKD: Distilling Knowledge From Synchronous Teaching for Efficient Model Compression.
IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):10051-10064. doi: 10.1109/TNNLS.2022.3164264. Epub 2023 Nov 30.
10. A General Dynamic Knowledge Distillation Method for Visual Analytics.
IEEE Trans Image Process. 2022 Oct 13;PP. doi: 10.1109/TIP.2022.3212905.

Cited By

1. Foundation models and intelligent decision-making: Progress, challenges, and perspectives.
Innovation (Camb). 2025 May 12;6(6):100948. doi: 10.1016/j.xinn.2025.100948. eCollection 2025 Jun 2.
2. A contrast enhanced representation normalization approach to knowledge distillation.
Sci Rep. 2025 Apr 16;15(1):13197. doi: 10.1038/s41598-025-97699-5.
3. Mixed Mutual Transfer for Long-Tailed Image Classification.
Entropy (Basel). 2024 Oct 2;26(10):839. doi: 10.3390/e26100839.
4. FCKDNet: A Feature Condensation Knowledge Distillation Network for Semantic Segmentation.
Entropy (Basel). 2023 Jan 7;25(1):125. doi: 10.3390/e25010125.