Zhu Ge, Li Jinbao, Guo Yahong
IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):6615-6627. doi: 10.1109/TNNLS.2021.3127959. Epub 2023 Sep 1.
Current methods aggregate multilevel features from the backbone and introduce edge information to obtain more refined saliency maps. However, little attention has been paid to suppressing background regions with saliency-like appearances. These regions usually lie in the vicinity of salient objects and have high contrast with the background, so they are easily misclassified as foreground. To solve this problem, we propose a gated feature interaction network (GFINet) to integrate multiple saliency features, which uses nonboundary features carrying background information to suppress pseudosalient objects and simultaneously applies boundary features to supplement edge details. Different from previous methods that only consider the complementarity between saliency and boundary, the proposed network introduces nonboundary features into the decoder to filter out pseudosalient objects. Specifically, GFINet consists of a global features aggregation branch (GFAB), a boundary and nonboundary features perception branch (B&NFPB), and a gated feature interaction module (GFIM). Guided by the global features generated by GFAB and the boundary and nonboundary features produced by B&NFPB, GFIM employs a gate structure to adaptively optimize the saliency information interchange among these features and thus predict the final saliency maps. Besides, owing to the imbalanced distribution between boundary and nonboundary pixels, the binary cross-entropy (BCE) loss struggles to predict the pixels near the boundary. Therefore, we design a border region aware (BRA) loss to further improve the quality of the boundary and nonboundary predictions, which guides the network to focus more on the hard pixels near the boundary by assigning different weights to different positions.
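The core idea of a border-region-aware loss, upweighting pixels near the ground-truth boundary inside a standard BCE, can be sketched as follows. This is a minimal illustrative sketch only: the boundary extraction via neighbour comparison, the dilation radius, and the fixed weight value are assumptions for demonstration, not the paper's exact BRA formulation.

```python
import numpy as np

def bra_weighted_bce(pred, target, radius=3, boundary_weight=5.0, eps=1e-7):
    """Boundary-region-aware weighted BCE (illustrative sketch).

    Pixels within `radius` of the ground-truth boundary receive a larger
    weight so the loss focuses on the hard pixels near object edges.
    `radius` and `boundary_weight` are assumed values, not the paper's.
    """
    # Locate boundary pixels: a pixel lies on the boundary if any
    # 4-neighbour of it carries a different label.
    pad = np.pad(target, 1, mode='edge')
    boundary = (
        (pad[:-2, 1:-1] != target) | (pad[2:, 1:-1] != target) |
        (pad[1:-1, :-2] != target) | (pad[1:-1, 2:] != target)
    )

    # Dilate the boundary mask `radius` times to cover the border region.
    region = boundary.copy()
    for _ in range(radius):
        p = np.pad(region, 1, mode='constant')
        region = (p[:-2, 1:-1] | p[2:, 1:-1] |
                  p[1:-1, :-2] | p[1:-1, 2:] | region)

    # Per-pixel weights: larger inside the border region, 1 elsewhere.
    weights = np.where(region, boundary_weight, 1.0)

    # Weighted binary cross-entropy, normalized by the total weight.
    pred = np.clip(pred, eps, 1.0 - eps)
    bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float((weights * bce).sum() / weights.sum())
```

With this weighting, a misprediction next to an object edge is penalized `boundary_weight` times as much as the same misprediction deep in the background, which is the behaviour the BRA loss targets.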
Compared with 12 counterparts, experimental results on five benchmark datasets show that our method generalizes better and improves on the state-of-the-art approach by 4.85% on average in terms of regional and boundary evaluation measures. In addition, our model is more efficient, with an inference speed of 50.3 FPS when processing a 320 × 320 image. Code has been made available at https://github.com/lesonly/GFINet.