
Visual Saliency Detection Based on Multiscale Deep CNN Features.

Publication Information

IEEE Trans Image Process. 2016 Nov;25(11):5012-5024. doi: 10.1109/TIP.2016.2602079. Epub 2016 Aug 24.

Abstract

Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this paper, we discover that a high-quality visual saliency model can be learned from multiscale features extracted using deep convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for feature extraction at three different scales. The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature. To generate a more robust feature, we integrate handcrafted low-level features with our deep contrast feature. To promote further research and evaluation of visual saliency models, we also construct a new large database of 4447 challenging images and their pixelwise saliency annotations. Experimental results demonstrate that our proposed method is capable of achieving the state-of-the-art performance on all public benchmarks, improving the F-measure by 6.12% and 10%, respectively, on the DUT-OMRON data set and our new data set (HKU-IS), and lowering the mean absolute error by 9% and 35.3%, respectively, on these two data sets.
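The abstract describes the architecture only at a high level: CNN features extracted at three nested scales around each region, fully connected layers whose penultimate activation is the "deep contrast feature", and a final saliency score. The following toy numpy sketch illustrates that data flow under loud assumptions: the pooled mean/std statistics stand in for real CNN activations, the weights are random rather than learned, and the window sizes and layer widths are invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def pooled_features(patch):
    """Toy stand-in for CNN features: channel-wise mean and std of a patch."""
    return np.concatenate([patch.mean(axis=(0, 1)), patch.std(axis=(0, 1))])

def multiscale_features(image, cy, cx, half_sizes=(8, 16, 32)):
    """Features of three nested windows around (cy, cx): roughly the
    region itself, its neighborhood, and (here) the whole image."""
    feats = []
    for s in half_sizes:
        y0, x0 = max(cy - s, 0), max(cx - s, 0)
        feats.append(pooled_features(image[y0:cy + s, x0:cx + s]))
    return np.concatenate(feats)

# Fully connected layers on top of the concatenated multiscale features;
# the hidden ReLU activation plays the role of the deep contrast feature
# (the penultimate layer described in the abstract).
D_IN, D_HID = 3 * 6, 16          # 3 scales x (mean + std over 3 channels)
W1 = rng.standard_normal((D_HID, D_IN)) * 0.1
W2 = rng.standard_normal(D_HID) * 0.1

def saliency_score(image, cy, cx):
    x = multiscale_features(image, cy, cx)
    contrast = np.maximum(W1 @ x, 0.0)               # "deep contrast feature"
    return 1.0 / (1.0 + np.exp(-(W2 @ contrast)))    # sigmoid -> score in (0, 1)

image = rng.random((64, 64, 3))
score = saliency_score(image, 32, 32)
```

In the actual method the per-region scores are trained against pixelwise ground-truth annotations and, per the abstract, concatenated with handcrafted low-level features before the final prediction; this sketch only shows the multiscale-feature-to-score pipeline shape.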

