IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):8827-8844. doi: 10.1109/TPAMI.2022.3233584. Epub 2023 Jun 5.
Semi-supervised semantic segmentation aims to learn a semantic segmentation model from a limited number of labeled images and abundant unlabeled images. The key to this task is generating reliable pseudo labels for the unlabeled images. Existing methods mainly produce pseudo labels based on the confidence scores predicted for unlabeled images, while largely ignoring the labeled images and their accurate annotations. In this paper, we propose a Cross-Image Semantic Consistency guided Rectifying (CISC-R) approach for semi-supervised semantic segmentation, which explicitly leverages the labeled images to rectify the generated pseudo labels. CISC-R is inspired by the observation that images belonging to the same class have a high pixel-level correspondence. Specifically, given an unlabeled image and its initial pseudo labels, we first query a guiding labeled image that shares the same semantic information as the unlabeled image. We then estimate the pixel-level similarity between the unlabeled image and the queried labeled image to form a CISC map, which guides a reliable pixel-level rectification of the pseudo labels. Extensive experiments on the PASCAL VOC 2012, Cityscapes, and COCO datasets demonstrate that the proposed CISC-R significantly improves the quality of the pseudo labels and outperforms state-of-the-art methods. Code is available at https://github.com/Luffy03/CISC-R.
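The abstract does not say how the guiding image is queried, how pixel-level similarity is measured, or how the CISC map is applied to the pseudo labels. The following is only a minimal NumPy sketch of the three steps under stated assumptions: the guide is the first labeled image whose mask contains the target class, similarity is the cosine similarity to a class prototype (the mean feature of that class in the guide), and rectification is a per-pixel down-weighting of the pseudo-label confidence. All function names are hypothetical, and this is not the authors' implementation, which is in the repository linked above.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def query_guide(labeled_set, target_class):
    """Assumed query rule: return the first (features, mask) pair whose
    ground-truth mask contains `target_class`, or None if there is none."""
    for feats, mask in labeled_set:
        if np.any(mask == target_class):
            return feats, mask
    return None

def cisc_map(unlabeled_feats, labeled_feats, labeled_mask, target_class):
    """Assumed similarity: cosine similarity between each unlabeled pixel
    feature (H, W, C) and the class prototype of the guide image.
    Returns an (H, W) map clipped to [0, 1]."""
    class_pixels = labeled_feats[labeled_mask == target_class]  # (N, C)
    if class_pixels.shape[0] == 0:
        return np.zeros(unlabeled_feats.shape[:2])
    prototype = l2_normalize(class_pixels.mean(axis=0))         # (C,)
    sims = l2_normalize(unlabeled_feats) @ prototype            # (H, W)
    return np.clip(sims, 0.0, 1.0)

def rectify_confidence(pseudo_labels, confidence, cisc, target_class):
    """Assumed rectification: scale the confidence of pixels pseudo-labeled
    as `target_class` by the CISC map; other pixels are left unchanged."""
    weights = confidence.astype(float).copy()
    selected = pseudo_labels == target_class
    weights[selected] = confidence[selected] * cisc[selected]
    return weights
```

In a training loop, the returned weights could multiply the per-pixel loss on unlabeled images, so that pixels whose features disagree with the guiding labeled image contribute less. A prototype is used here only to keep the sketch short; a dense pixel-to-pixel comparison would be closer to the "pixel-level correspondence" the abstract describes.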