Passalis Nikolaos, Tzelepi Maria, Tefas Anastasios
IEEE Trans Neural Netw Learn Syst. 2021 May;32(5):2030-2039. doi: 10.1109/TNNLS.2020.2995884. Epub 2021 May 3.
Knowledge-transfer (KT) methods allow the knowledge contained in a large deep learning model to be transferred into a more lightweight and faster model. However, the vast majority of existing KT approaches are designed to handle mainly classification and detection tasks, which limits their performance on other tasks, such as representation/metric learning. To overcome this limitation, a novel probabilistic KT (PKT) method is proposed in this article. PKT transfers the knowledge into a smaller student model while preserving as much of the information expressed by the teacher model as possible. The ability of the proposed method to use different kernels for estimating the probability distributions of the teacher and student models, along with the different divergence metrics that can be used for transferring the knowledge, makes it easy to adapt the method to different applications. PKT outperforms several existing state-of-the-art KT techniques and provides new insights into KT by enabling several novel applications, as demonstrated through extensive experiments on several challenging data sets.
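The abstract does not include an implementation, but the mechanism it describes (estimating a probability distribution over pairwise feature similarities for teacher and student, then minimizing a divergence between the two) can be sketched compactly. The following is a minimal PyTorch sketch under assumed choices of a shifted cosine-similarity kernel and KL divergence; the function name `pkt_loss` and all variable names are illustrative, not the authors' reference code.

```python
import torch
import torch.nn.functional as F


def pkt_loss(teacher_feats: torch.Tensor,
             student_feats: torch.Tensor,
             eps: float = 1e-7) -> torch.Tensor:
    """Sketch of a probabilistic knowledge-transfer loss.

    Models, for each sample, a conditional distribution over the other
    samples in the batch from pairwise cosine similarities, separately in
    the teacher and student feature spaces, and penalizes the KL
    divergence between the teacher and student distributions.
    """
    # L2-normalize so that dot products become cosine similarities.
    t = F.normalize(teacher_feats, dim=1)
    s = F.normalize(student_feats, dim=1)

    # Pairwise cosine similarities, shifted from [-1, 1] into [0, 1]
    # so they can serve as (nonnegative) kernel values.
    t_sim = (t @ t.t() + 1.0) / 2.0
    s_sim = (s @ s.t() + 1.0) / 2.0

    # Normalize each row into a conditional probability distribution.
    t_prob = t_sim / t_sim.sum(dim=1, keepdim=True)
    s_prob = s_sim / s_sim.sum(dim=1, keepdim=True)

    # KL divergence between teacher and student distributions,
    # averaged over all pairs (eps guards the logarithm).
    return (t_prob * torch.log((t_prob + eps) / (s_prob + eps))).mean()


# Example usage with random features. Note that the loss depends only on
# pairwise similarities, so the teacher and student feature
# dimensionalities may differ, which is what makes this style of
# transfer suitable for representation/metric learning.
teacher_feats = torch.randn(32, 512)  # e.g., teacher penultimate layer
student_feats = torch.randn(32, 64)   # smaller student embedding
loss = pkt_loss(teacher_feats, student_feats)
```

Per the abstract, the cosine kernel and KL divergence used here are only one configuration: swapping in a different kernel for the density estimate or a different divergence metric is exactly the flexibility the method claims for adapting to different applications.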