IEEE Trans Neural Netw Learn Syst. 2016 Jun;27(6):1135-49. doi: 10.1109/TNNLS.2015.2506664. Epub 2016 Jan 5.
Salient object detection is receiving increasing attention as an important component or step in several pattern recognition and image processing tasks. Although a variety of powerful saliency models have been proposed, they usually involve heavy feature (or model) engineering based on priors (or assumptions) about the properties of objects and backgrounds. Inspired by the effectiveness of recently developed feature learning, we provide a novel deep image saliency computing (DISC) framework for fine-grained image saliency computing. In particular, we model the image saliency from both coarse- and fine-level observations, and utilize deep convolutional neural networks (CNNs) to learn the saliency representation in a progressive manner. Specifically, our saliency model is built upon two stacked CNNs. The first CNN generates a coarse-level saliency map by taking the overall image as the input, roughly identifying salient regions in the global context. Furthermore, we integrate superpixel-based local context information in the first CNN to refine the coarse-level saliency map. Guided by the coarse saliency map, the second CNN focuses on the local context to produce a fine-grained and accurate saliency map while preserving object details. For a test image, the two CNNs collaboratively conduct the saliency computing in one shot. Our DISC framework is capable of uniformly highlighting the objects of interest against complex backgrounds while preserving object details well. Extensive experiments on several standard benchmarks suggest that DISC outperforms other state-of-the-art methods and that it also generalizes well across data sets without additional training. The executable version of DISC is available online: http://vision.sysu.edu.cn/projects/DISC.
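The coarse-to-fine data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two CNNs are replaced by hypothetical stand-in callables (`toy_coarse`, `toy_fine`), and the superpixel step is assumed here to be a simple per-superpixel averaging of the coarse map, which the abstract does not specify. All function and variable names are illustrative.

```python
from typing import Callable, Dict, List

Image = List[List[float]]        # grayscale image, values in [0, 1]
SaliencyMap = List[List[float]]  # per-pixel saliency, values in [0, 1]
Labels = List[List[int]]         # superpixel label for every pixel


def refine_with_superpixels(coarse: SaliencyMap, labels: Labels) -> SaliencyMap:
    """ASSUMPTION: refine by replacing each pixel's saliency with the
    mean saliency of its superpixel. The paper's actual integration of
    superpixel context inside the first CNN may differ."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for sal_row, lab_row in zip(coarse, labels):
        for s, lab in zip(sal_row, lab_row):
            sums[lab] = sums.get(lab, 0.0) + s
            counts[lab] = counts.get(lab, 0) + 1
    return [[sums[lab] / counts[lab] for lab in lab_row] for lab_row in labels]


def coarse_to_fine(
    image: Image,
    labels: Labels,
    coarse_model: Callable[[Image], SaliencyMap],
    fine_model: Callable[[Image, SaliencyMap], SaliencyMap],
) -> SaliencyMap:
    """Run both stages in a single forward pass ("one shot")."""
    coarse = coarse_model(image)                      # stage 1: global context
    coarse = refine_with_superpixels(coarse, labels)  # superpixel-based refinement
    return fine_model(image, coarse)                  # stage 2: guided by coarse map


# Hypothetical stand-ins for the two trained CNNs.
def toy_coarse(image: Image) -> SaliencyMap:
    # Threshold brightness as a crude "global" saliency guess.
    return [[1.0 if v > 0.5 else 0.0 for v in row] for row in image]


def toy_fine(image: Image, coarse: SaliencyMap) -> SaliencyMap:
    # Blend local pixel evidence with the coarse guidance map.
    return [
        [0.5 * v + 0.5 * c for v, c in zip(img_row, c_row)]
        for img_row, c_row in zip(image, coarse)
    ]


if __name__ == "__main__":
    image = [[0.1, 0.9], [0.2, 0.8]]
    labels = [[0, 1], [0, 1]]  # two vertical superpixels
    print(coarse_to_fine(image, labels, toy_coarse, toy_fine))
```

The point of the sketch is the composition: the second stage receives both the image and the refined coarse map, so it can sharpen object boundaries rather than locate the object from scratch.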