IEEE Trans Image Process. 2018 Sep;27(9):4503-4515. doi: 10.1109/TIP.2018.2839901.
Motivated by the recent success of supervised and weakly supervised common object discovery, this paper goes one step further and tackles common object discovery in a fully unsupervised way. Object co-localization generally aims to simultaneously localize objects of the same class across a group of images. Traditional object localization/detection usually trains specific object detectors, which require bounding-box annotations of object instances, or at least image-level labels indicating the presence or absence of objects in an image. Given a collection of images without any annotations, our fully unsupervised method simultaneously discovers the images that contain common objects and localizes those objects in the corresponding images. Without needing to know the total number of common objects, we formulate unsupervised object discovery as a sub-graph mining problem on a weighted graph of object proposals, where nodes correspond to object proposals and edges represent the similarities between neighbouring proposals. The positive images and common objects are jointly discovered by finding sub-graphs of strongly connected nodes, with each sub-graph capturing one object pattern. The optimization problem can be solved efficiently by our proposed maximal-flow-based algorithm. Rather than assuming that each image contains only one common object, our solution better handles images in the wild, each of which may contain multiple common objects or none at all. Moreover, the method can easily be tailored to image retrieval, in which the nodes correspond to the similarity between query and reference images. Extensive experiments on the PASCAL VOC 2007 and Object Discovery data sets demonstrate that, even without any supervision, our approach can discover and localize common objects of various classes in the presence of scale, viewpoint, and appearance variation as well as partial occlusion.
We also conduct broad experiments on two image retrieval benchmarks, the Holidays and Oxford5k data sets, to show that our method, which considers both the similarity between query and reference images and the similarities among reference images, can significantly improve retrieval results.
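The sub-graph mining idea can be illustrated with a small sketch. This is not the maximal-flow-based algorithm proposed in the paper. It substitutes a simple greedy peeling heuristic for finding a dense sub-graph, only to show how a weighted proposal graph can yield both the common object and the positive images. The node naming scheme, the function names, and all similarity values below are hypothetical.

```python
from itertools import combinations


def weighted_density(nodes, weights):
    """Total edge weight inside `nodes` divided by the number of nodes."""
    if not nodes:
        return 0.0
    total = sum(
        weights.get(frozenset(pair), 0.0)
        for pair in combinations(sorted(nodes), 2)
    )
    return total / len(nodes)


def greedy_dense_subgraph(nodes, weights):
    """Greedy peeling (a stand-in heuristic, NOT the paper's max-flow solver).

    Repeatedly drops the node with the smallest weighted degree and
    returns the densest intermediate node set seen along the way.
    """
    current = set(nodes)
    best, best_density = set(current), weighted_density(current, weights)
    while len(current) > 1:
        # Weighted degree of each remaining node within the current set.
        degree = {
            u: sum(weights.get(frozenset((u, v)), 0.0) for v in current if v != u)
            for u in current
        }
        # Sorting makes tie-breaking deterministic.
        weakest = min(sorted(current), key=degree.get)
        current.remove(weakest)
        density = weighted_density(current, weights)
        if density > best_density:
            best, best_density = set(current), density
    return best, best_density


# Hypothetical proposals, identified as (image id, proposal index).
proposals = [("img1", 0), ("img1", 1), ("img2", 0), ("img3", 0), ("img4", 0)]

# Hypothetical pairwise similarities between proposals (undirected edges).
similarities = {
    frozenset((("img1", 0), ("img2", 0))): 0.9,
    frozenset((("img1", 0), ("img3", 0))): 0.8,
    frozenset((("img2", 0), ("img3", 0))): 0.85,
    frozenset((("img1", 1), ("img4", 0))): 0.1,
    frozenset((("img1", 1), ("img2", 0))): 0.05,
}

common_object, density = greedy_dense_subgraph(proposals, similarities)
# Images that own at least one selected proposal count as positive images.
positive_images = {image_id for image_id, _ in common_object}
print(sorted(common_object), sorted(positive_images))
```

In this toy graph, the three mutually similar proposals from img1, img2, and img3 form the selected sub-graph, while img4 contributes no proposal and is therefore not treated as a positive image. Finding several object patterns would require repeating the search after removing the selected nodes, which is again an assumption of this sketch rather than a description of the paper's procedure.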