IEEE Trans Image Process. 2021;30:2784-2797. doi: 10.1109/TIP.2021.3054464. Epub 2021 Feb 12.
Recent advances in the joint processing of a set of images have shown its advantages over processing each image individually. Unlike existing works geared towards co-segmentation or co-localization, in this article we explore a new joint processing topic: image co-skeletonization, defined as the joint skeleton extraction of the foreground objects in an image collection. Object skeletonization in a single natural image is known to be challenging, because there is hardly any prior knowledge available about the object present in the image. We therefore resort to image co-skeletonization, hoping that the commonness prior existing across semantically similar images can be leveraged to provide such knowledge, as in other joint processing problems such as co-segmentation. Moreover, earlier research has found that augmenting a skeletonization process with the object's shape information is highly beneficial in capturing the image context. Motivated by these two observations, we propose a coupled framework for the co-skeletonization and co-segmentation tasks, in which the co-segmentation process supplies shape information to the co-skeletonization process. While image co-skeletonization is our primary goal, the co-segmentation process can in turn benefit from the skeleton outputs of the co-skeletonization process, which serve as central object seeds; through such a coupled framework, the two processes reinforce each other synergistically. For evaluating image co-skeletonization results, we also construct a novel benchmark dataset by annotating nearly 1,800 images divided into 38 semantic categories. Although the proposed method is essentially weakly supervised, it can also be employed in supervised and unsupervised scenarios. Extensive experiments demonstrate that it achieves promising results in all three scenarios.
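The coupled framework described above alternates between the two tasks: the current segmentation mask provides shape information for skeletonization, and the resulting skeleton pixels act as central object seeds for the next segmentation round. The following is a minimal toy sketch of that alternation on a single image, assuming deliberately simplified stand-ins for both steps (per-row midpoints as the "skeleton", seed-boosted thresholding as the "segmentation"); it illustrates only the coupling structure, not the paper's actual energy formulation or optimization.

```python
import numpy as np

def toy_skeletonize(mask):
    """Toy 'skeleton': the midpoint of each row's foreground run.
    A stand-in (assumption) for a real thinning/medial-axis step."""
    skel = np.zeros_like(mask)
    for r in range(mask.shape[0]):
        cols = np.flatnonzero(mask[r])
        if cols.size:
            skel[r, cols[cols.size // 2]] = True
    return skel

def toy_segment(image, skeleton, thresh=0.5, prior_weight=0.25):
    """Toy 'segmentation': threshold the image, with scores boosted
    near skeleton pixels. The boost mimics using the skeleton as
    central object seeds (again, a simplified stand-in)."""
    seed = skeleton.copy()
    # cheap one-pixel horizontal dilation of the skeleton seeds
    seed[:, 1:] |= skeleton[:, :-1]
    seed[:, :-1] |= skeleton[:, 1:]
    return (image.astype(float) + prior_weight * seed) > thresh

def co_process(image, n_iters=3):
    """Alternate the two steps: mask -> skeleton -> refined mask."""
    mask = image > 0.5                    # initial mask, no shape prior
    for _ in range(n_iters):
        skel = toy_skeletonize(mask)      # skeleton given current mask
        mask = toy_segment(image, skel)   # mask given skeleton seeds
    return mask, skel

# Example: a 3x5 rectangular "object" of intensity 0.6 on a dark background.
img = np.zeros((5, 7))
img[1:4, 1:6] = 0.6
mask, skel = co_process(img)
# The skeleton lies inside the final mask, one medial pixel per object row.
```

In the paper's actual setting, the skeletonization and segmentation steps are solved jointly across the whole image collection so that the commonness prior is shared, rather than per image as in this single-image sketch.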