IEEE Trans Pattern Anal Mach Intell. 2017 Mar;39(3):486-500. doi: 10.1109/TPAMI.2016.2552172. Epub 2016 Apr 8.
A weakly supervised semantic segmentation (WSSS) method aims to learn a segmentation model from weak (image-level) rather than strong (pixel-level) labels. By avoiding the tedious pixel-level annotation process, it can exploit the vast supply of user-tagged images from media-sharing sites such as Flickr for large-scale applications. However, these 'free' tags are often noisy, and few existing works address the problem of learning with labels that are both weak and noisy. In this work, we cast the WSSS problem as a label noise reduction problem. Specifically, after each image is segmented into a set of superpixels, the weak and potentially noisy image-level labels are propagated to the superpixel level, resulting in highly noisy labels; the key to semantic segmentation is thus to identify and correct the noisy superpixel labels. To this end, a novel L1-optimisation based sparse learning model is formulated to directly and explicitly detect noisy labels. To solve the L1-optimisation problem, we further develop an efficient learning algorithm by introducing an intermediate labelling variable. Extensive experiments on three benchmark datasets show that our method yields state-of-the-art results given noise-free labels, whilst significantly outperforming existing methods when the weak labels are also noisy.
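The two steps described above (propagating image-level tags to superpixels, then detecting noisy labels with a sparse penalty) can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a generic decomposition Y = F + E, where F is smooth over a superpixel similarity graph and E is a sparse noise term, minimising 0.5*||Y - F - E||^2 + (mu/2)*tr(F^T L F) + lam*||E||_1 by alternating updates. The function names, the Gaussian affinity, the parameters mu, lam and sigma, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def propagate_tags(image_tags, sp_image_ids, n_classes):
    """Copy each image's tags to every superpixel of that image.

    Returns an (n_superpixels, n_classes) 0/1 matrix of noisy labels.
    """
    Y = np.zeros((len(sp_image_ids), n_classes))
    for i, img in enumerate(sp_image_ids):
        Y[i, sorted(image_tags[img])] = 1.0
    return Y

def soft_threshold(R, lam):
    """Proximal operator of lam * ||.||_1 (elementwise shrinkage)."""
    return np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)

def detect_noisy_labels(Y, X, mu=10.0, lam=0.5, sigma=1.0, n_iter=50):
    """Alternate between a graph-smooth labelling F and a sparse noise term E.

    Hypothetical illustration only; not the model from the paper.
    """
    # Gaussian affinity between superpixel features, no self-loops.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W          # unnormalised graph Laplacian
    A = np.eye(len(Y)) + mu * L
    E = np.zeros_like(Y)
    for _ in range(n_iter):
        F = np.linalg.solve(A, Y - E)       # F-step: (I + mu*L) F = Y - E
        E = soft_threshold(Y - F, lam)      # E-step: shrink the residual
    return F, E

# Toy data: three images, two superpixels each, two visual clusters.
image_tags = {0: {0}, 1: {0, 1}, 2: {1}}    # image 1 carries both tags
sp_image_ids = [0, 0, 1, 1, 2, 2]
X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2]).reshape(-1, 1)

Y = propagate_tags(image_tags, sp_image_ids, n_classes=2)
F, E = detect_noisy_labels(Y, X)
flagged = np.argwhere(np.abs(E) > 1e-6).tolist()
print(flagged)  # (superpixel, class) pairs whose propagated label looks wrong
```

In this toy setting, the two superpixels of the doubly tagged image each inherit one tag that disagrees with their visually similar neighbours in the other images, and those are the entries where E is non-zero. The L1 penalty is what keeps E sparse, so only a few propagated labels are flagged rather than all of them being adjusted slightly.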