Zhang Quanshi, Wang Xin, Cao Ruiming, Wu Ying Nian, Shi Feng, Zhu Song-Chun
IEEE Trans Pattern Anal Mach Intell. 2021 Nov;43(11):3863-3877. doi: 10.1109/TPAMI.2020.2992207. Epub 2021 Oct 1.
This paper introduces an explanatory graph representation to reveal the object parts encoded inside the convolutional layers of a CNN. Given a pre-trained CNN, each filter in a conv-layer usually represents a mixture of object parts. We develop a simple yet effective method to learn an explanatory graph that automatically disentangles object parts from each filter without any part annotations. Specifically, given the feature map of a filter, we mine neural activations from the feature map that correspond to different object parts. The explanatory graph organizes each mined part as a graph node; each edge connects two nodes whose corresponding object parts usually co-activate and keep a stable spatial relationship. Experiments show that each graph node consistently represents the same object part across different images, which boosts the transferability of CNN features. When the explanatory graph transfers features of object parts to the task of part localization, our method significantly outperforms other approaches.
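The mining-and-linking idea in the abstract can be sketched in a few lines: treat strong local maxima in a filter's feature map as candidate part locations, then connect two candidates by an edge when their spatial offset stays nearly constant across images. This is only an illustrative toy, not the paper's actual method; the activation threshold, the 3x3 local-maximum test, and the offset-variance criterion are all assumptions made here for clarity (the paper learns the graph by fitting a probabilistic model to feature maps).

```python
import numpy as np

def mine_part_peaks(feature_map, threshold=0.5):
    """Return (row, col) positions whose activation exceeds `threshold`
    and is a local maximum in its 3x3 neighborhood. A toy stand-in for
    the paper's part mining; threshold and neighborhood size are
    assumptions, not values from the paper."""
    H, W = feature_map.shape
    peaks = []
    for i in range(H):
        for j in range(W):
            v = feature_map[i, j]
            if v <= threshold:
                continue
            nbhd = feature_map[max(0, i - 1):i + 2, max(0, j - 1):j + 2]
            if v >= nbhd.max():
                peaks.append((i, j))
    return peaks

def build_explanatory_graph(peaks_per_image, max_offset_var=1.0):
    """Toy graph construction: nodes are part indices; an edge joins two
    parts whose relative displacement has low variance across images,
    i.e. they 'co-activate and keep a stable spatial relationship'."""
    n_parts = len(peaks_per_image[0])
    edges = []
    for a in range(n_parts):
        for b in range(a + 1, n_parts):
            offsets = np.array([
                [peaks[a][0] - peaks[b][0], peaks[a][1] - peaks[b][1]]
                for peaks in peaks_per_image
            ])
            # Sum of per-axis variances measures how stable the offset is.
            if offsets.var(axis=0).sum() <= max_offset_var:
                edges.append((a, b))
    return edges

# Usage: two synthetic "images", three mined parts each. Parts 0 and 1
# keep the same offset in both images, so only the (0, 1) edge survives.
peaks = [[(2, 2), (2, 5), (6, 1)],
         [(3, 3), (3, 6), (0, 7)]]
print(build_explanatory_graph(peaks))
```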