Department of Electronic Engineering, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China.
Data61, CSIRO, and Australian National University, Canberra, ACT, Australia.
IEEE Trans Image Process. 2018;27(1):121-134. doi: 10.1109/TIP.2017.2756825.
In this paper, we propose a novel edge-preserving and multi-scale contextual neural network for salient object detection. The proposed framework aims to address two limitations of existing CNN-based methods. First, region-based CNN methods lack sufficient context to accurately locate salient objects because they process each region independently. Second, pixel-based CNN methods suffer from blurry boundaries due to the presence of convolutional and pooling layers. Motivated by these observations, we first propose an end-to-end edge-preserving neural network based on the Fast R-CNN framework (named ) to efficiently generate saliency maps with sharp object boundaries. Then, to further improve it, multi-scale spatial context is attached to it to model the relationship between regions and the global scene. Furthermore, our method generalizes to RGB-D saliency detection through depth refinement. The proposed framework achieves both clear detection boundaries and multi-scale contextual robustness simultaneously for the first time, and thus achieves optimized performance. Experiments on six RGB and two RGB-D benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance.
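To make the "multi-scale spatial context" idea concrete, below is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the module name MultiScaleContextHead, the enlarge helper, the scale factors, and the feature dimensions are all illustrative assumptions. It pools each candidate region at several enlarged context windows plus the whole image, then fuses the descriptors into a single saliency score per region, which is one plausible reading of attaching region-to-scene context to a Fast R-CNN-style head.

    # Hypothetical sketch of multi-scale spatial context for region saliency
    # scoring. All names and sizes here are illustrative, not from the paper.
    import torch
    import torch.nn as nn
    from torchvision.ops import roi_align

    def enlarge(boxes, factor, h, w):
        """Scale each (x1, y1, x2, y2) box about its center, clamped to the image."""
        cx = (boxes[:, 0] + boxes[:, 2]) / 2
        cy = (boxes[:, 1] + boxes[:, 3]) / 2
        bw = (boxes[:, 2] - boxes[:, 0]) * factor / 2
        bh = (boxes[:, 3] - boxes[:, 1]) * factor / 2
        return torch.stack([
            (cx - bw).clamp(0, w - 1), (cy - bh).clamp(0, h - 1),
            (cx + bw).clamp(0, w - 1), (cy + bh).clamp(0, h - 1)], dim=1)

    class MultiScaleContextHead(nn.Module):
        """Pools each region at several context scales plus the whole image,
        then scores region saliency from the fused descriptor."""
        def __init__(self, channels=256, pool=7, scales=(1.0, 2.0, 4.0)):
            super().__init__()
            self.scales = scales
            self.pool = pool
            feat_dim = channels * pool * pool * (len(scales) + 1)  # +1: global context
            self.fc = nn.Sequential(
                nn.Linear(feat_dim, 512), nn.ReLU(inplace=True),
                nn.Linear(512, 1))  # per-region saliency logit

        def forward(self, feats, boxes):
            # feats: (1, C, H, W) conv features; boxes: (K, 4) in feature-map coords
            _, _, h, w = feats.shape
            parts = []
            for s in self.scales:
                rois = enlarge(boxes, s, h, w)
                parts.append(roi_align(feats, [rois], self.pool).flatten(1))
            # Global scene context, shared by every region.
            img_box = torch.tensor([[0.0, 0.0, w - 1.0, h - 1.0]],
                                   device=feats.device, dtype=feats.dtype)
            g = roi_align(feats, [img_box], self.pool).flatten(1)
            parts.append(g.expand(boxes.size(0), -1))
            return self.fc(torch.cat(parts, dim=1))

    # Usage: score one candidate region against its context and the scene.
    feats = torch.randn(1, 256, 60, 80)           # conv features for one image
    boxes = torch.tensor([[10.0, 12.0, 30.0, 40.0]])
    head = MultiScaleContextHead()
    logit = head(feats, boxes)                    # shape (1, 1)

The design point this sketch illustrates is that concatenating progressively enlarged context windows with a shared global descriptor lets each region be judged relative to both its surroundings and the scene as a whole, which is what region-independent CNN scoring lacks.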