Division of Computer Science and Engineering, Jeonbuk National University, Jeollabuk-do 54896, Korea.
Research Center for Artificial Intelligence Technology, Jeonbuk National University, Jeollabuk-do 54896, Korea.
Sensors (Basel). 2021 Feb 11;21(4):1280. doi: 10.3390/s21041280.
Explaining the predictions of deep neural networks makes the networks more understandable and trustworthy, enabling their use in various mission-critical tasks. Recent progress in the learning capability of networks has primarily been due to the enormous number of model parameters, so it is usually hard to interpret their operations, in contrast to classical white-box models. For this purpose, generating saliency maps is a popular approach to identifying the important input features used in the model prediction. Existing explanation methods typically use only the output of the last convolution layer of the model to generate a saliency map, neglecting the information contained in intermediate layers. As a result, the corresponding explanations are coarse and of limited accuracy. Although the accuracy can be improved by iteratively refining a saliency map, this is too time-consuming to be practical. To address these problems, we propose a novel approach that explains the model prediction by training an attentive surrogate network via knowledge distillation. The surrogate network aims to generate a fine-grained saliency map corresponding to the model prediction, using meaningful regional information present across all network layers. Experiments demonstrated that the saliency maps are the result of spatially attentive features learned through the distillation, and are therefore useful for fine-grained classification tasks. Moreover, the proposed method runs at 24.3 frames per second, which is faster than existing methods by orders of magnitude.
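To make the distillation idea concrete, the sketch below shows one common way a spatial attention (saliency) map can be derived from an intermediate feature activation and matched between a teacher and a surrogate (student) network: summing squared channel activations and normalising, then penalising the distance between the two maps. This is a minimal illustration of attention-style distillation in general, not the paper's exact formulation; the function names and the toy feature shapes are illustrative assumptions.

```python
import numpy as np

def attention_map(feats):
    """Collapse a C x H x W feature activation into a spatial saliency
    map by summing squared channel activations, then L2-normalising.
    (Illustrative; the paper's surrogate aggregates all layers.)"""
    amap = (feats ** 2).sum(axis=0)          # H x W spatial map
    norm = np.linalg.norm(amap)
    return amap / norm if norm > 0 else amap

def attention_transfer_loss(student_feats, teacher_feats):
    """One distillation term: make the student's spatial attention
    match the teacher's at a given layer (L2 distance of maps)."""
    return float(np.linalg.norm(
        attention_map(student_feats) - attention_map(teacher_feats)))

# Toy example: 8-channel 4x4 feature maps standing in for one layer.
rng = np.random.default_rng(0)
teacher = rng.standard_normal((8, 4, 4))
student = rng.standard_normal((8, 4, 4))
print(attention_transfer_loss(student, teacher))  # > 0: maps differ
print(attention_transfer_loss(teacher, teacher))  # 0.0: maps identical
```

In a full training loop, such a term would be summed over several intermediate layers and added to the task loss, which is how layer-wise regional information (rather than only the last convolution layer) can shape the surrogate's saliency map.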