Zhang Xiaolin, Wei Yunchao, Yang Yi, Huang Thomas S
IEEE Trans Cybern. 2020 Sep;50(9):3855-3865. doi: 10.1109/TCYB.2020.2992433. Epub 2020 Jun 4.
One-shot semantic image segmentation is the challenging task of recognizing object regions from unseen categories with only one annotated example as supervision. In this article, we propose a simple yet effective similarity guidance network for one-shot segmentation (SG-One). We aim to predict the segmentation mask of a query image with reference to one densely labeled support image of the same category. To obtain a robust representative feature of the support image, we first adopt a masked average pooling strategy that produces the guidance features by taking into account only the pixels inside the annotated mask of the support image. We then use cosine similarity to relate the guidance features to the features of each pixel in the query image. In this way, the likelihood information embedded in the resulting similarity maps can be used to guide the segmentation of objects. Furthermore, SG-One is a unified framework that efficiently processes both support and query images within one network and can be trained end to end. We conduct extensive experiments on Pascal VOC 2012. In particular, SG-One achieves a mean intersection-over-union (mIoU) score of 46.3%, surpassing the baseline methods.
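The two core steps named in the abstract, masked average pooling and cosine similarity, can be illustrated with a minimal sketch. It is not the authors' implementation: in SG-One the features would come from a convolutional network, whereas here they are toy nested lists of shape [C][H][W], and the function names `masked_average_pooling` and `cosine_similarity_map` as well as the `eps` stabilizer are assumptions made for illustration.

```python
import math


def masked_average_pooling(features, mask):
    """Average each channel over the pixels where mask == 1.

    features: nested lists shaped [C][H][W] (toy stand-in for CNN features).
    mask:     nested lists shaped [H][W] with 0/1 entries.
    Returns a guidance vector of length C.
    """
    area = sum(sum(row) for row in mask)
    if area == 0:
        raise ValueError("mask selects no pixels")
    return [
        sum(f * m
            for f_row, m_row in zip(channel, mask)
            for f, m in zip(f_row, m_row)) / area
        for channel in features
    ]


def cosine_similarity_map(guidance, query_features, eps=1e-8):
    """Cosine similarity between the guidance vector and every query pixel.

    query_features: nested lists shaped [C][H][W].
    eps is a hypothetical stabilizer that avoids division by zero.
    Returns an [H][W] similarity map.
    """
    channels = len(query_features)
    height = len(query_features[0])
    width = len(query_features[0][0])
    g_norm = math.sqrt(sum(g * g for g in guidance))
    sim = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = [query_features[c][y][x] for c in range(channels)]
            dot = sum(g * p for g, p in zip(guidance, pixel))
            p_norm = math.sqrt(sum(p * p for p in pixel))
            row.append(dot / (g_norm * p_norm + eps))
        sim.append(row)
    return sim


# Toy support image: 2 channels, 2x2 pixels, object in the left column.
support_features = [[[1.0, 0.0], [1.0, 0.0]],
                    [[0.0, 1.0], [0.0, 1.0]]]
support_mask = [[1, 0], [1, 0]]
guidance = masked_average_pooling(support_features, support_mask)

# Toy query image with the same feature layout.
query_features = [[[2.0, 0.0], [0.0, 3.0]],
                  [[0.0, 2.0], [5.0, 0.0]]]
similarity = cosine_similarity_map(guidance, query_features)
```

In this toy case the similarity map is close to 1 at query pixels whose features point in the same direction as the guidance vector and 0 where they are orthogonal. Because cosine similarity ignores feature magnitude, only the direction of each pixel's feature vector matters.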