Li Zun, Lang Congyan, Liew Jun Hao, Li Yidong, Hou Qibin, Feng Jiashi
IEEE Trans Image Process. 2021;30:4587-4598. doi: 10.1109/TIP.2021.3072811. Epub 2021 Apr 29.
Models based on the feature pyramid network (FPN), which fuse semantics and salient details progressively, have proven highly effective in salient object detection. However, these models often generate saliency maps with incomplete object structures or unclear object boundaries, because information propagates only indirectly between distant layers, which makes the fusion structure less effective. In this work, we propose a novel Cross-layer Feature Pyramid Network (CFPN), which enables direct cross-layer communication to improve progressive fusion in salient object detection. Specifically, the proposed network first aggregates multi-scale features from different layers into feature maps that have access to both high- and low-level information. It then distributes the aggregated features to all the involved layers so that each gains access to richer context. In this way, the distributed features of each layer carry semantics and salient details from all other layers simultaneously, and lose less important information during progressive feature fusion. Finally, CFPN fuses the distributed features of each layer stage by stage. As a result, the high-level features that contain context useful for locating complete objects are preserved until the final output layer, and the low-level features that contain spatial structure details are embedded into each layer. Extensive experiments on six widely used salient object detection benchmarks with three popular backbones demonstrate that CFPN can accurately locate fairly complete salient regions and effectively segment object boundaries.
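The aggregate, distribute, and fuse steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layers involved, so nearest-neighbour resizing, channel concatenation, and element-wise addition are assumed here as stand-ins for whatever learned operations CFPN actually uses, and every function name below is hypothetical.

```python
import numpy as np


def resize_nearest(x, size):
    """Nearest-neighbour resize of a (C, H, W) array to (C, size, size)."""
    _, h, w = x.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return x[:, rows][:, :, cols]


def cross_layer_aggregate(features, size):
    """Bring every pyramid level to one resolution and concatenate channels.

    Assumption: plain concatenation stands in for the learned aggregation.
    """
    return np.concatenate([resize_nearest(f, size) for f in features], axis=0)


def distribute(aggregated, features):
    """Send the aggregated map back to each level at that level's resolution."""
    return [resize_nearest(aggregated, f.shape[1]) for f in features]


def progressive_fuse(distributed):
    """Stage-by-stage fusion from the coarsest level to the finest.

    Assumption: upsample-and-add stands in for the learned fusion blocks.
    """
    fused = distributed[-1]
    for finer in reversed(distributed[:-1]):
        fused = resize_nearest(fused, finer.shape[1]) + finer
    return fused


# Hypothetical backbone outputs, ordered from fine (low-level) to coarse (high-level).
rng = np.random.default_rng(0)
features = [rng.standard_normal((c, s, s)) for c, s in [(4, 32), (8, 16), (16, 8)]]

aggregated = cross_layer_aggregate(features, size=16)
distributed = distribute(aggregated, features)
fused = progressive_fuse(distributed)
```

In this sketch every distributed map carries the channels of all levels, which mirrors the abstract's claim that each layer sees both semantics and details before the progressive fusion begins. The shapes and channel counts are arbitrary choices made for illustration.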