IEEE Trans Image Process. 2018 Mar;27(3):1487-1500. doi: 10.1109/TIP.2017.2774041. Epub 2017 Nov 15.
Fine-grained image classification aims to recognize hundreds of subcategories belonging to the same basic-level category, such as 200 subcategories of birds, which is highly challenging due to the large variance within the same subcategory and the small variance among different subcategories. Existing methods generally first locate the objects or parts and then discriminate which subcategory the image belongs to. However, they mainly have two limitations: 1) they rely on object or part annotations, which are heavily labor consuming; and 2) they ignore the spatial relationships between the object and its parts as well as among these parts, both of which are significantly helpful for finding discriminative parts. Therefore, this paper proposes the object-part attention model (OPAM) for weakly supervised fine-grained image classification, and its main novelties are: 1) the object-part attention model integrates two levels of attention: object-level attention localizes objects in images, and part-level attention selects discriminative parts of the object. Both are jointly employed to learn multi-view and multi-scale features and to enhance their mutual promotion. 2) The object-part spatial constraint model combines two spatial constraints: the object spatial constraint ensures that the selected parts are highly representative, and the part spatial constraint eliminates redundancy and enhances the discrimination of the selected parts. Both are jointly employed to exploit the subtle and local differences that distinguish the subcategories. Importantly, neither object nor part annotations are used in our proposed approach, which avoids the heavy labor cost of labeling. Compared with more than ten state-of-the-art methods on four widely used datasets, our OPAM approach achieves the best performance.
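To make the two spatial constraints concrete, below is a minimal, hypothetical Python sketch of how candidate part boxes could be filtered: the object constraint keeps candidates that lie mostly inside the localized object box, and the part constraint greedily drops candidates that heavily overlap already-selected parts. The function names, thresholds, and the greedy scheme are illustrative assumptions, not the paper's exact formulation (which also incorporates saliency within the part constraint).

```python
# Illustrative sketch of object-part spatial constraints for part selection.
# Boxes are (x1, y1, x2, y2); all names and thresholds are assumptions.

def area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def inter_area(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    h = max(0.0, min(ay2, by2) - max(ay1, by1))
    return w * h

def select_parts(candidates, object_box, num_parts=2,
                 min_inside=0.8, max_overlap=0.3):
    """Pick part proposals under two hedged spatial constraints.

    Object spatial constraint: a candidate is kept only if most of its
    area falls inside the object box localized by object-level attention.
    Part spatial constraint: candidates that heavily overlap an already
    selected part are skipped, reducing redundancy among the picks.
    """
    # Object spatial constraint: filter by fraction of the part inside the object.
    inside = [p for p in candidates
              if area(p) > 0 and inter_area(p, object_box) / area(p) >= min_inside]
    # Part spatial constraint: greedy selection with an overlap cap.
    selected = []
    for part in inside:
        if all(inter_area(part, s) / min(area(part), area(s)) <= max_overlap
               for s in selected):
            selected.append(part)
        if len(selected) == num_parts:
            break
    return selected
```

In this sketch the candidates would come from part-level attention (e.g., ranked region proposals), and the selected parts would then feed the multi-view, multi-scale feature learning described above; the ranking and scoring details are placeholders for the paper's actual constraint model.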