Li Chuan, Teng Xiao, Ding Yan, Lan Long
College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China.
Sensors (Basel). 2024 Jun 3;24(11):3617. doi: 10.3390/s24113617.
Most logit-based knowledge distillation methods transfer soft labels from the teacher model to the student model via a Kullback-Leibler divergence based on softmax, an exponential normalization function. However, the exponential nature of softmax tends to prioritize the largest class (the target class) while neglecting smaller ones (the non-target classes), leading to an oversight of the non-target classes' significance. To address this issue, we propose Non-Target-Class-Enhanced Knowledge Distillation (NTCE-KD) to amplify the role of non-target classes in terms of both magnitude and diversity. Specifically, we present a magnitude-enhanced Kullback-Leibler (MKL) divergence that multi-shrinks the target class to enhance the impact of non-target classes in terms of magnitude. Additionally, to enrich the diversity of non-target classes, we introduce a diversity-based data augmentation strategy (DDA), further enhancing overall performance. Extensive experiments on the CIFAR-100 and ImageNet-1k datasets demonstrate that non-target classes are of great significance and that our method achieves state-of-the-art performance across a wide range of teacher-student pairs.
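To illustrate the intuition behind the abstract (not the paper's actual MKL formulation, whose details are not given here), the sketch below shows how shrinking the target-class logit before softmax redistributes probability mass toward the non-target classes. The logit values and the shrink factor are hypothetical:

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical teacher logits; index 0 is the target class.
logits = [8.0, 3.0, 2.0, 1.0]
p = softmax(logits)

# Shrink only the target-class logit (a simplified stand-in for the
# paper's multi-shrinking operation on the target class).
shrunk = [logits[0] * 0.5] + logits[1:]
q = softmax(shrunk)

# The non-target classes now carry a larger share of the distribution,
# so a KL-based distillation loss weights them more heavily.
print(f"non-target mass before: {sum(p[1:]):.4f}, after: {sum(q[1:]):.4f}")
```

Because softmax is exponential, even a modest reduction of the dominant logit substantially increases the relative probabilities of the remaining classes, which is the magnitude effect the abstract attributes to MKL.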