Zhang Xiaoning, Yu Yi, Wang Yuqing, Chen Xiaolin, Wang Chenglong
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China.
University of Chinese Academy of Sciences, Beijing 100049, China.
Sensors (Basel). 2023 Jul 20;23(14):6562. doi: 10.3390/s23146562.
Salient object detection has made substantial progress by exploiting multi-level convolutional features; the key question is how to combine these features effectively and efficiently. Because of the step-by-step down-sampling operations in almost all CNNs, multi-level features usually have different scales. Methods based on fully convolutional networks directly apply bilinear up-sampling to low-resolution deep features and then combine them with high-resolution shallow features by addition or concatenation, which neglects feature compatibility and causes misalignment problems. In this paper, to solve this problem, we propose an alignment integration network (ALNet), which aligns adjacent-level features progressively to generate powerful combinations. To capture long-range dependencies for high-level integrated features while maintaining high computational efficiency, a strip attention module (SAM) is introduced into the alignment integration procedure. Benefiting from SAM, multi-level semantics can be selectively propagated to predict precise salient objects. Furthermore, although integrating multi-level convolutional features alleviates the blurred-boundary problem to a certain extent, it is still unsatisfactory for restoring real object boundaries. Therefore, we design a simple but effective boundary enhancement module (BEM) to guide the network to focus on boundaries and other error-prone regions. Based on BEM, an attention-weighted loss is proposed to drive the network to generate sharper object boundaries. Experimental results on five benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance on salient object detection. Moreover, we extend the experiments to remote sensing datasets, and the results further confirm the universality and scalability of ALNet.