What is a salient object? A dataset and a baseline model for salient object detection.

Publication information

IEEE Trans Image Process. 2015 Feb;24(2):742-56. doi: 10.1109/TIP.2014.2383320.

Abstract

Salient object detection (or salient region detection) models, unlike fixation prediction models, have traditionally dealt with locating and segmenting the most salient object or region in a scene. While the notion of a most salient object is sensible when multiple objects exist in a scene, current datasets for evaluating saliency detection approaches often contain scenes with only a single object. We make three main contributions in this paper. First, we take an in-depth look at the problem of salient object detection by studying the relationship between where people look in scenes and what they choose as the most salient object when explicitly asked. Based on the agreement between fixations and saliency judgments, we suggest that the most salient object is the one that attracts the highest fraction of fixations. Second, we provide two new, less biased benchmark datasets containing scenes with multiple objects that challenge existing saliency models. Indeed, we observed a severe drop in the performance of eight state-of-the-art models on our datasets (40%-70%). Third, we propose a very simple yet powerful model based on superpixels to be used as a baseline for model evaluation and comparison. While on par with the best models on the MSRA-5K dataset, our model outperforms other models on our data, highlighting a serious drawback of existing models: they conflate the process of locating the most salient object with the process of segmenting it. We also provide a review and statistical analysis of some labeled scene datasets that can be used for evaluating salient object detection models. We believe that our work can greatly help remedy the over-fitting of models to existing biased datasets and open new avenues for future research in this fast-evolving field.
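
The criterion stated in the abstract, that the most salient object is the one attracting the highest fraction of fixations, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the (row, col) fixation format, and the per-object boolean masks are all assumptions made for illustration.

```python
import numpy as np

def fixation_fractions(fixations, masks):
    """Fraction of all fixations that land inside each object's mask.

    fixations: sequence of (row, col) pixel coordinates (assumed format).
    masks: dict mapping an object name to a 2-D boolean array (assumed format).
    """
    fixations = np.asarray(fixations, dtype=int)
    if fixations.size == 0:
        raise ValueError("at least one fixation is required")
    rows, cols = fixations[:, 0], fixations[:, 1]
    total = len(fixations)
    # Index each mask at the fixated pixels and count the hits.
    return {name: float(mask[rows, cols].sum()) / total
            for name, mask in masks.items()}

def most_salient_object(fixations, masks):
    """Name of the object with the highest fixation fraction, plus all fractions."""
    fractions = fixation_fractions(fixations, masks)
    return max(fractions, key=fractions.get), fractions

# Toy scene: a 4x6 image with two hypothetical objects, "A" and "B".
mask_a = np.zeros((4, 6), dtype=bool)
mask_a[:, 0:3] = True          # object A occupies the left three columns
mask_b = np.zeros((4, 6), dtype=bool)
mask_b[:, 4:6] = True          # object B occupies the right two columns

# Three fixations on A, one on B, one on the background (column 3).
fixations = [(0, 0), (1, 1), (2, 2), (3, 4), (1, 3)]
winner, fractions = most_salient_object(fixations, {"A": mask_a, "B": mask_b})
print(winner, fractions)
```

In this toy scene, object A receives three of the five fixations and is selected. Fixations that fall on the background still count toward the total, so the fractions need not sum to one.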
