Mittal Sudhanshu, Tatarchenko Maxim, Brox Thomas
IEEE Trans Pattern Anal Mach Intell. 2021 Apr;43(4):1369-1379. doi: 10.1109/TPAMI.2019.2960224. Epub 2021 Mar 4.
The ability to understand visual information from limited labeled data is an important aspect of machine learning. While image-level classification has been extensively studied in a semi-supervised setting, dense pixel-level classification with limited data has only drawn attention recently. In this work, we propose an approach for semi-supervised semantic segmentation that learns from limited pixel-wise annotated samples while exploiting additional annotation-free images. The proposed approach relies on adversarial training with a feature matching loss to learn from unlabeled images. It uses two network branches that link semi-supervised classification with semi-supervised segmentation, including self-training. The dual-branch approach reduces both the low-level and the high-level artifacts that typically arise when training with few labels. The approach improves significantly over existing methods, especially when trained with very few labeled samples. On several standard benchmarks (PASCAL VOC 2012, PASCAL-Context, and Cityscapes), the approach achieves a new state of the art in semi-supervised learning.
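The abstract names a feature matching loss but does not define it. As a rough illustration only, the sketch below uses the generic form of that loss from the GAN literature: the mean absolute difference between batch-averaged discriminator features of real and generated inputs. The function names, the toy feature vectors, and the choice of an L1 distance are assumptions for illustration and are not taken from the paper.

```python
from statistics import fmean


def mean_features(batch):
    """Average each feature dimension over a batch of equal-length vectors."""
    return [fmean(column) for column in zip(*batch)]


def feature_matching_loss(real_features, fake_features):
    """Mean absolute difference between the two batch-mean feature vectors."""
    real_mean = mean_features(real_features)
    fake_mean = mean_features(fake_features)
    return fmean(abs(r - f) for r, f in zip(real_mean, fake_mean))


# Hypothetical discriminator features (assumed values, two samples per batch):
# "real" stands for features of ground-truth label maps,
# "fake" for features of predicted segmentation maps.
real = [[1.0, 2.0], [3.0, 4.0]]
fake = [[1.0, 1.0], [1.0, 3.0]]
loss = feature_matching_loss(real, fake)
```

Because the loss compares batch statistics rather than per-pixel labels, it can in principle be computed on predictions for unlabeled images, which is the role the abstract assigns to it.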