Ji Wei, Li Xi, Wei Lina, Wu Fei, Zhuang Yueting
IEEE Trans Image Process. 2020 Jun 25;PP. doi: 10.1109/TIP.2020.3002083.
Many recent saliency detection methods focus mainly on designing complex network architectures to aggregate powerful features from backbone networks. However, contextual information is not well utilized, which often causes background regions to be falsely detected and object boundaries to be blurred. Motivated by these issues, we propose an easy-to-implement module that uses the edge-preserving ability of superpixels and a graph neural network to exchange contextual information among superpixel nodes. In more detail, we first extract features from the backbone network and obtain the superpixel information of the images. This step is followed by superpixel pooling, in which we transfer the irregular superpixel information to a structured feature representation. To propagate information among the foreground and background regions, we use a graph neural network and a self-attention layer to better evaluate the degree of saliency. Additionally, an affinity loss is proposed to regularize the affinity matrix and thereby constrain the propagation path. Moreover, we extend our module to a multiscale structure with different numbers of superpixels. Experiments on five challenging datasets show that our approach can improve the performance of three baseline methods in terms of several popular evaluation metrics.
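The pooling and propagation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the pooling operator or the form of the affinity matrix, so average pooling and an unlearned, softmax-normalized dot-product affinity are assumed here, and the function names `superpixel_average_pool`, `attention_affinity`, and `unpool` are hypothetical. The superpixel label map is taken as given (for example, from any superpixel segmentation algorithm).

```python
import numpy as np


def superpixel_average_pool(features, labels, num_superpixels):
    """Average the feature vectors of all pixels that share a superpixel label.

    features: (H, W, C) float array, e.g. a backbone feature map resized
              to the resolution of the label map.
    labels:   (H, W) int array with values in [0, num_superpixels).
    Returns a (num_superpixels, C) array; rows of empty superpixels stay zero.
    """
    channels = features.shape[-1]
    flat_feats = features.reshape(-1, channels)
    flat_labels = labels.reshape(-1)
    sums = np.zeros((num_superpixels, channels), dtype=np.float64)
    # Unbuffered scatter-add: accumulate each pixel's features into its node.
    np.add.at(sums, flat_labels, flat_feats)
    counts = np.bincount(flat_labels, minlength=num_superpixels)
    return sums / np.maximum(counts, 1)[:, None]


def attention_affinity(nodes):
    """Row-normalized affinity between nodes (assumed form, no learned weights)."""
    scores = nodes @ nodes.T / np.sqrt(nodes.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def unpool(node_features, labels):
    """Broadcast each node's features back to every pixel of its superpixel."""
    return node_features[labels]


# Toy example: a 2x2 image with one feature channel and two superpixels.
features = np.array([[[1.0], [3.0]], [[5.0], [7.0]]])
labels = np.array([[0, 0], [1, 1]])
nodes = superpixel_average_pool(features, labels, num_superpixels=2)
affinity = attention_affinity(nodes)
propagated = affinity @ nodes           # one propagation step over the graph
pixel_map = unpool(propagated, labels)  # back to an (H, W, C) grid
```

Because every pixel inside a superpixel receives the same propagated feature, the resulting map follows the superpixel boundaries, which is the edge-preserving property the abstract refers to. A multiscale variant would repeat the same steps with label maps containing different numbers of superpixels.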