Gao Honghao, Xiao Junsheng, Yin Yuyu, Liu Tong, Shi Jiangang
IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):4826-4838. doi: 10.1109/TNNLS.2022.3155486. Epub 2024 Apr 4.
Fully supervised semantic segmentation performs well in many computer vision tasks, but training such models is costly because it requires a large number of pixel-level annotated samples. Few-shot segmentation has recently become a popular way to address this problem, since it needs only a handful of annotated samples to generalize to new categories. However, making full use of the limited samples remains an open problem. This article therefore proposes a mutually supervised few-shot segmentation network. First, the feature maps from intermediate convolution layers are fused to enrich the feature representation. Second, the support image and query image are combined into a bipartite graph, and a graph attention network is adopted to avoid losing spatial information and to increase the number of support-image pixels that guide the query image segmentation. Third, the attention map of the query image is used as prior information to enhance the support image segmentation, which forms a mutually supervised regime. Finally, the attention maps of the intermediate layers are fused and sent to a graph reasoning layer to infer the pixel categories. Experiments on the PASCAL VOC-5i and FSS-1000 datasets demonstrate the effectiveness of the method and its superior performance compared with baseline methods.
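The bipartite-graph step can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the abstract gives no equations, so this assumes plain cosine-similarity attention with no learned weights, where every query pixel is connected to every foreground support pixel. The function name `bipartite_attention`, its arguments, and the choice of cosine similarity are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bipartite_attention(query_feat, support_feat, support_mask):
    """Attend from query pixels to foreground support pixels (illustrative only).

    query_feat:   (C, Hq, Wq) feature map of the query image
    support_feat: (C, Hs, Ws) feature map of the support image
    support_mask: (Hs, Ws) binary foreground mask of the support image
    Returns an attention map (Hq, Wq) and aggregated features (C, Hq, Wq).
    """
    C, Hq, Wq = query_feat.shape
    q = query_feat.reshape(C, -1).T      # (Nq, C): one graph node per query pixel
    s = support_feat.reshape(C, -1).T    # (Ns, C): one graph node per support pixel
    fg = support_mask.reshape(-1).astype(bool)
    if not fg.any():
        raise ValueError("support_mask contains no foreground pixels")

    # Cosine similarity between every query node and every support node.
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    sn = s / (np.linalg.norm(s, axis=1, keepdims=True) + 1e-8)
    sim = qn @ sn.T                      # (Nq, Ns)

    # Assumption: edges exist only between query pixels and foreground support pixels.
    scores = np.where(fg[None, :], sim, -np.inf)
    weights = softmax(scores, axis=1)    # each row sums to 1 over foreground pixels

    aggregated = weights @ s             # (Nq, C): support features passed to each query pixel
    attn_map = (weights * sim).sum(axis=1)  # attention-weighted similarity to the foreground
    return attn_map.reshape(Hq, Wq), aggregated.T.reshape(C, Hq, Wq)
```

Because every foreground support pixel contributes through its own edge, no spatial pooling of the support features is needed, which is the property the abstract attributes to the graph formulation. A mutually supervised variant would, under the same assumptions, run the attention a second time with the roles of the two images swapped and a thresholded query attention map used as the mask.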