IEEE Trans Pattern Anal Mach Intell. 2022 Nov;44(11):8082-8096. doi: 10.1109/TPAMI.2021.3083269. Epub 2022 Oct 4.
Weakly supervised semantic segmentation is receiving great attention due to its low human annotation cost. In this paper, we aim to tackle bounding box supervised semantic segmentation, i.e., training accurate semantic segmentation models using bounding box annotations as supervision. To this end, we propose an Affinity Attention Graph Neural Network (A2GNN). Following previous practices, we first generate pseudo semantic-aware seeds, which are then formed into semantic graphs based on our newly proposed affinity Convolutional Neural Network (CNN). The built graphs are then fed into our A2GNN, in which an affinity attention layer is designed to acquire short- and long-distance information from soft graph edges, so that semantic labels can be accurately propagated from the confident seeds to the unlabeled pixels. However, to guarantee the precision of the seeds, we only adopt a limited number of confident pixel seed labels for A2GNN, which may lead to insufficient supervision for training. To alleviate this issue, we further introduce a new loss function and a consistency-checking mechanism to leverage the bounding box constraint, so that more reliable guidance can be included for model optimization. Experiments show that our approach achieves new state-of-the-art performance on the PASCAL VOC 2012 dataset (val: 76.5%, test: 75.2%). More importantly, our approach can be readily applied to the bounding box supervised instance segmentation task or other weakly supervised semantic segmentation tasks, with state-of-the-art or comparable performance on almost all weakly supervised tasks on the PASCAL VOC and COCO datasets. Our source code will be available at https://github.com/zbf1991/A2GNN.
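To illustrate the core idea of propagating labels from confident seeds over a graph with soft, affinity-weighted edges, the sketch below shows a minimal affinity-attention message-passing step in PyTorch. It is not the authors' A2GNN implementation; the layer structure, tensor shapes, and names such as AffinityAttentionLayer and propagate_seed_labels are illustrative assumptions only, and the affinity matrix is assumed to come from a separately trained affinity network.

```python
# Hypothetical, simplified sketch of affinity-attention label propagation.
# Not the authors' A2GNN code; shapes, names, and the attention form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AffinityAttentionLayer(nn.Module):
    """One propagation step: soft graph edges (pixel affinities) gate an
    attention-weighted aggregation of neighbor features."""

    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, x, affinity):
        # x:        (N, dim)  node (pixel/superpixel) features
        # affinity: (N, N)    soft edge weights in [0, 1]
        attn = self.query(x) @ self.key(x).t() / x.size(-1) ** 0.5
        # Soft edges modulate attention, so both nearby and distant nodes
        # can contribute according to their affinity.
        attn = F.softmax(attn, dim=-1) * affinity
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        return x + attn @ self.value(x)  # residual update


def propagate_seed_labels(features, affinity, seed_labels, seed_mask, num_layers=3):
    """Run several affinity-attention steps, then supervise a per-node
    classifier only on the confident seed nodes."""
    dim = features.size(-1)
    num_classes = int(seed_labels[seed_mask].max().item()) + 1
    layers = nn.ModuleList(AffinityAttentionLayer(dim) for _ in range(num_layers))
    classifier = nn.Linear(dim, num_classes)

    h = features
    for layer in layers:
        h = layer(h, affinity)
    logits = classifier(h)

    # Only the limited set of confident seeds contributes to this loss;
    # the paper's additional box-constraint loss and consistency checking
    # would supply extra supervision for the remaining pixels.
    loss = F.cross_entropy(logits[seed_mask], seed_labels[seed_mask])
    return logits, loss
```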