

NTCE-KD: Non-Target-Class-Enhanced Knowledge Distillation

Authors

Li Chuan, Teng Xiao, Ding Yan, Lan Long

Affiliation

College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China.

Publication

Sensors (Basel). 2024 Jun 3;24(11):3617. doi: 10.3390/s24113617.

DOI: 10.3390/s24113617
PMID: 38894408
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11175301/
Abstract

Most logit-based knowledge distillation methods transfer soft labels from the teacher model to the student model via Kullback-Leibler divergence based on softmax, an exponential normalization function. However, the exponential nature of softmax tends to prioritize the largest class (the target class) while neglecting smaller ones (the non-target classes), leading to an oversight of the non-target classes' significance. To address this issue, we propose Non-Target-Class-Enhanced Knowledge Distillation (NTCE-KD) to amplify the role of non-target classes in terms of both magnitude and diversity. Specifically, we present a magnitude-enhanced Kullback-Leibler (MKL) divergence that multi-shrinks the target class to enhance the impact of non-target classes in terms of magnitude. Additionally, to enrich the diversity of non-target classes, we introduce a diversity-based data augmentation strategy (DDA), further enhancing overall performance. Extensive experimental results on the CIFAR-100 and ImageNet-1k datasets demonstrate that non-target classes are of great significance and that our method achieves state-of-the-art performance across a wide range of teacher-student pairs.
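The core idea described in the abstract, suppressing the target-class logit so that non-target classes carry more of the distilled signal, can be sketched as follows. This is a minimal illustration only: the abstract does not give the exact MKL formulation, so the shrinking rule here (pulling the target logit toward the mean of the non-target logits, `k` times) and the names `shrink_target`, `k`, and `alpha` are assumptions, not the paper's method.

```python
import numpy as np

def softmax(z):
    """Exponential normalization over a 1-D logit vector."""
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kl_div(p, q, eps=1e-12):
    """KL(p || q), the usual distillation objective between soft labels."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def shrink_target(logits, target, k=1, alpha=0.5):
    """Hypothetical target-class shrinking: pull the target logit toward
    the mean of the non-target logits k times, so that softmax assigns a
    larger share of probability mass to the non-target classes.
    (`k` and `alpha` are illustrative parameters, not from the paper.)"""
    z = np.asarray(logits, dtype=float).copy()
    for _ in range(k):
        non_target_mean = np.delete(z, target).mean()
        z[target] = alpha * z[target] + (1.0 - alpha) * non_target_mean
    return z

# Teacher logits where class 0 (the target class) dominates.
teacher_logits = np.array([6.0, 2.0, 1.0, 0.5])
p_plain  = softmax(teacher_logits)
p_shrunk = softmax(shrink_target(teacher_logits, target=0, k=2))

# Shrinking the target class transfers probability mass to the
# non-target classes, amplifying their role in the distillation signal.
assert p_shrunk[1:].sum() > p_plain[1:].sum()
```

Distilling against `p_shrunk` rather than `p_plain` would weight the student's errors on non-target classes more heavily, which is the magnitude-side intuition the abstract describes; the diversity side (DDA) is a data-augmentation strategy and is not sketched here.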


Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e68d/11175301/6995cc5c952b/sensors-24-03617-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e68d/11175301/8668c0a8c111/sensors-24-03617-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e68d/11175301/eafe67a9ab47/sensors-24-03617-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e68d/11175301/68fdde08327b/sensors-24-03617-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e68d/11175301/605e0bcb9db3/sensors-24-03617-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e68d/11175301/c634418206bc/sensors-24-03617-g006.jpg

Similar Articles

1. NTCE-KD: Non-Target-Class-Enhanced Knowledge Distillation.
   Sensors (Basel). 2024 Jun 3;24(11):3617. doi: 10.3390/s24113617.
2. Memory-Replay Knowledge Distillation.
   Sensors (Basel). 2021 Apr 15;21(8):2792. doi: 10.3390/s21082792.
3. Multi-teacher knowledge distillation based on joint Guidance of Probe and Adaptive Corrector.
   Neural Netw. 2023 Jul;164:345-356. doi: 10.1016/j.neunet.2023.04.015. Epub 2023 Apr 26.
4. Mitigating carbon footprint for knowledge distillation based deep learning model compression.
   PLoS One. 2023 May 15;18(5):e0285668. doi: 10.1371/journal.pone.0285668. eCollection 2023.
5. Decoupled graph knowledge distillation: A general logits-based method for learning MLPs on graphs.
   Neural Netw. 2024 Nov;179:106567. doi: 10.1016/j.neunet.2024.106567. Epub 2024 Jul 23.
6. Complementary label learning based on knowledge distillation.
   Math Biosci Eng. 2023 Sep 19;20(10):17905-17918. doi: 10.3934/mbe.2023796.
7. Dual Distillation Discriminator Networks for Domain Adaptive Few-Shot Learning.
   Neural Netw. 2023 Aug;165:625-633. doi: 10.1016/j.neunet.2023.06.009. Epub 2023 Jun 15.
8. Highlight Every Step: Knowledge Distillation via Collaborative Teaching.
   IEEE Trans Cybern. 2022 Apr;52(4):2070-2081. doi: 10.1109/TCYB.2020.3007506. Epub 2022 Apr 5.
9. Cosine similarity knowledge distillation for surface anomaly detection.
   Sci Rep. 2024 Apr 8;14(1):8150. doi: 10.1038/s41598-024-58409-9.
10. DCCD: Reducing Neural Network Redundancy via Distillation.
    IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):10006-10017. doi: 10.1109/TNNLS.2023.3238337. Epub 2024 Jul 8.
