An attribution graph-based interpretable method for CNNs.

Affiliations

School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, Shandong, China; State Key Laboratory of High-end Server & Storage Technology, Jinan, 250300, Shandong, China.

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China; State Key Laboratory of High-end Server & Storage Technology, Jinan, 250300, Shandong, China.

Publication information

Neural Netw. 2024 Nov;179:106597. doi: 10.1016/j.neunet.2024.106597. Epub 2024 Aug 5.

Abstract

Convolutional Neural Networks (CNNs) have demonstrated outstanding performance in various domains, such as face recognition, object detection, and image segmentation. However, the lack of transparency and limited interpretability inherent in CNNs pose challenges in fields such as medical diagnosis, autonomous driving, finance, and military applications. Several studies have explored the interpretability of CNNs and proposed various post-hoc interpretable methods. The majority of these methods are feature-based, focusing on the influence of input variables on outputs; few analyze the parameters of CNNs and their overall structure. To explore the structure of CNNs and intuitively comprehend the role of their internal parameters, we propose an Attribution Graph-based Interpretable method for CNNs (AGIC), which models the overall structure of CNNs as graphs and provides interpretability from global and local perspectives. The runtime parameters of CNNs and the feature maps of each image sample are used to construct attribution graphs (At-GCs), in which the convolutional kernels are represented as nodes and the SHAP values between kernel outputs are assigned as edges. These At-GCs are then employed to pretrain a newly designed heterogeneous graph encoder based on Deep Graph Infomax (DGI). To comprehensively delve into the overall structure of CNNs, the pretrained encoder is used for two types of interpretable tasks: (1) a classifier is attached to the pretrained encoder to classify At-GCs, revealing the dependency of the At-GCs' topological characteristics on the image sample categories, and (2) a scoring aggregation (SA) network is constructed to assess the importance of each node in the At-GCs, thereby reflecting the relative importance of kernels in the CNN. The experimental results indicate that the topological characteristics of an At-GC depend on the sample category used in its construction, revealing that kernels in CNNs show distinct combined activation patterns when processing different image categories. Meanwhile, the kernels that receive high scores from the SA network are crucial for feature extraction, whereas low-scoring kernels can be pruned without affecting model performance, thereby enhancing the interpretability of CNNs.
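The central construction described above — modelling a CNN as a graph whose nodes are convolutional kernels and whose edges carry attribution scores between kernel outputs — can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the toy TinyCNN, the build_attribution_graph helper, and in particular the gradient-times-activation edge score, which merely stands in for the SHAP values the paper actually assigns as edges.

```python
# Minimal sketch: build an attribution graph (At-GC) for one image sample.
# Nodes : convolutional kernels (one node per output channel of each conv layer).
# Edges : attribution of a kernel's output to the next layer's kernel outputs.
# NOTE: the paper uses SHAP values between kernel outputs as edge weights; a
# gradient-times-activation score stands in here purely to keep the sketch short.
import torch
import torch.nn as nn
import networkx as nx

class TinyCNN(nn.Module):                      # stand-in CNN, not the paper's model
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 16, 3, padding=1)
        self.head = nn.Linear(16, 10)

    def forward(self, x):
        a1 = torch.relu(self.conv1(x))         # feature maps of layer 1 (8 kernels)
        a2 = torch.relu(self.conv2(a1))        # feature maps of layer 2 (16 kernels)
        return self.head(a2.mean(dim=(2, 3))), a1, a2

def build_attribution_graph(model, image):
    """Return a graph whose nodes are kernels and whose edges carry
    attribution scores between kernel outputs for this one sample."""
    _, a1, a2 = model(image)
    g = nx.DiGraph()
    g.add_nodes_from([("conv1", i) for i in range(a1.shape[1])])
    g.add_nodes_from([("conv2", j) for j in range(a2.shape[1])])
    for j in range(a2.shape[1]):               # attribute each conv2 kernel output...
        grads = torch.autograd.grad(a2[:, j].sum(), a1, retain_graph=True)[0]
        contrib = (grads * a1).sum(dim=(0, 2, 3))   # ...to every conv1 kernel output
        for i in range(a1.shape[1]):
            g.add_edge(("conv1", i), ("conv2", j), weight=float(contrib[i]))
    return g

model = TinyCNN().eval()
sample = torch.randn(1, 3, 32, 32, requires_grad=True)   # stand-in image sample
at_gc = build_attribution_graph(model, sample)
print(at_gc.number_of_nodes(), at_gc.number_of_edges())
```

One such graph is built per image sample, so a dataset of images yields a dataset of At-GCs whose topology can then be compared across sample categories.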

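The abstract also describes pretraining a graph encoder on the At-GCs with Deep Graph Infomax and then scoring nodes to rank kernels. The sketch below illustrates that pipeline under clearly labelled assumptions: torch_geometric's generic DeepGraphInfomax with a plain GCN encoder stands in for the paper's newly designed heterogeneous encoder, and the linear score_head is only a hypothetical stand-in for the scoring aggregation (SA) network, whose architecture the abstract does not specify.

```python
# Sketch: DGI-style pretraining of a graph encoder over At-GCs, followed by a
# simple per-node scoring head. This is an illustration of the idea, not the
# paper's heterogeneous encoder or SA network.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, DeepGraphInfomax

class Encoder(nn.Module):
    """Generic GCN encoder producing one embedding per node (i.e. per kernel)."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.conv = GCNConv(in_dim, hid_dim)
    def forward(self, x, edge_index, edge_weight=None):
        return torch.relu(self.conv(x, edge_index, edge_weight))

def corruption(x, edge_index, edge_weight=None):
    # DGI negative samples: shuffle node features, keep the topology.
    return x[torch.randperm(x.size(0))], edge_index, edge_weight

in_dim, hid_dim = 16, 64
dgi = DeepGraphInfomax(
    hidden_channels=hid_dim,
    encoder=Encoder(in_dim, hid_dim),
    summary=lambda z, *args, **kwargs: torch.sigmoid(z.mean(dim=0)),
    corruption=corruption,
)
opt = torch.optim.Adam(dgi.parameters(), lr=1e-3)

# Toy stand-in for one At-GC: 24 kernel nodes with random features and edges.
x = torch.randn(24, in_dim)
edge_index = torch.randint(0, 24, (2, 96))
edge_weight = torch.rand(96)                        # SHAP-value edges in the paper

for _ in range(100):                                # unsupervised DGI pretraining
    opt.zero_grad()
    pos_z, neg_z, summary = dgi(x, edge_index, edge_weight)
    loss = dgi.loss(pos_z, neg_z, summary)
    loss.backward()
    opt.step()

# The pretrained node embeddings can then feed (1) a graph-level classifier over
# At-GCs or (2) a per-node scoring head; low-scoring kernels are pruning candidates.
score_head = nn.Linear(hid_dim, 1)                  # hypothetical stand-in for the SA network
with torch.no_grad():
    z, _, _ = dgi(x, edge_index, edge_weight)
    kernel_scores = score_head(z).squeeze(-1)       # one importance score per kernel node
    prune_candidates = kernel_scores.argsort()[:5]  # lowest-scoring kernels
print(prune_candidates.tolist())
```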
