IEEE Trans Pattern Anal Mach Intell. 2021 Nov;43(11):3949-3963. doi: 10.1109/TPAMI.2020.2993147. Epub 2021 Oct 1.
In this paper, we present a method to mine object-part patterns from the conv-layers of a pre-trained convolutional neural network (CNN). The mined patterns are organized in an And-Or graph (AOG). This interpretable AOG representation has a four-layer semantic hierarchy: semantic parts, part templates, latent patterns, and neural units. The AOG associates each object part with specific neural units in the feature maps of conv-layers, and it is constructed from very few part annotations (e.g., 3-20). We develop a question-answering (QA) method that uses active human-computer communication to mine patterns from a pre-trained CNN and thereby explain conv-layer features incrementally. During learning, the QA method uses the current AOG for part localization and actively identifies objects whose feature maps the AOG cannot explain. It then asks people to annotate parts on these unexplained objects and uses the answers to discover CNN patterns corresponding to the newly labeled parts. In this way, the method gradually grows new branches and refines existing branches of the AOG to semanticize CNN representations. In experiments, our method exhibited high learning efficiency: it used about 1/6 to 1/3 of the part annotations for training, yet achieved similar or better part-localization performance than fast-RCNN methods.
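The four-layer hierarchy and the ask-when-unexplained loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's actual model: the class names, the mean-activation scoring rule, the `threshold` and `budget` parameters, and the `ask_annotator` callback are all hypothetical. The sketch also only grows new branches; it does not refine existing ones.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a feature map is indexed as fmap[channel][row][col],
# and a neural unit (layer 4) is addressed by that (channel, row, col) triple.
FeatureMap = List[List[List[float]]]
Unit = Tuple[int, int, int]

@dataclass
class LatentPattern:
    # Layer 3: points at one neural unit in a conv-layer feature map.
    unit: Unit

@dataclass
class PartTemplate:
    # Layer 2: an AND node that combines all of its latent patterns.
    name: str
    patterns: List[LatentPattern] = field(default_factory=list)

    def score(self, fmap: FeatureMap) -> float:
        # Stand-in scoring rule: mean activation of the referenced units.
        if not self.patterns:
            return 0.0
        total = sum(fmap[c][y][x] for c, y, x in (p.unit for p in self.patterns))
        return total / len(self.patterns)

@dataclass
class SemanticPart:
    # Layer 1: an OR node that selects its best-scoring template.
    name: str
    templates: List[PartTemplate] = field(default_factory=list)

    def score(self, fmap: FeatureMap) -> float:
        return max((t.score(fmap) for t in self.templates), default=0.0)

def active_qa(part: SemanticPart,
              fmaps: Dict[str, FeatureMap],
              ask_annotator: Callable[[str], List[Unit]],
              threshold: float = 0.5,
              budget: int = 5) -> List[str]:
    """Repeatedly ask about the object the current AOG explains worst."""
    asked: List[str] = []
    for _ in range(budget):
        scores = {key: part.score(fmap) for key, fmap in fmaps.items()}
        worst = min(scores, key=scores.get)
        if scores[worst] >= threshold:
            break  # every object is explained well enough; stop asking
        units = ask_annotator(worst)  # the human answer, reduced to unit indices
        # Grow a new branch (template) from the answer.
        template = PartTemplate(f"template_{len(part.templates)}",
                                [LatentPattern(u) for u in units])
        part.templates.append(template)
        asked.append(worst)
    return asked

# Toy usage: two objects whose part activates different units.
fmaps = {
    "a": [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]],
    "b": [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]],
}
answers = {"a": [(0, 0, 0)], "b": [(1, 1, 1)]}
head = SemanticPart("head")
asked = active_qa(head, fmaps, lambda key: answers[key])
print(asked, len(head.templates))
```

In this toy run the loop asks about each object once, then stops because both are explained by some template. The `budget` cap bounds the number of questions even when an answer fails to raise an object's score.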