Erdem Erkut, Erdem Aykut
Department of Computer Engineering, Hacettepe University, Ankara, Turkey.
J Vis. 2013 Mar 18;13(4):11. doi: 10.1167/13.4.11.
To detect visually salient elements of complex natural scenes, computational bottom-up saliency models commonly examine several feature channels, such as color and orientation, in parallel. They compute a separate feature map for each channel and then linearly combine these maps to produce a master saliency map. However, only a few studies have investigated how different feature dimensions contribute to overall visual saliency. We address this integration issue and propose to use covariance matrices of simple image features (known as region covariance descriptors in the computer vision community; Tuzel, Porikli, & Meer, 2006) as meta-features for saliency estimation. As low-dimensional representations of image patches, region covariances capture local image structures better than standard linear filters, but more importantly, they naturally provide nonlinear integration of different features by modeling their correlations. We also show that first-order statistics of features can be easily incorporated into the proposed approach to improve performance. Our experimental evaluation on several benchmark data sets demonstrates that the proposed approach outperforms state-of-the-art models on various tasks, including prediction of human eye fixations, salient object detection, and image retargeting.
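To make the idea of a region covariance descriptor concrete, the following is a minimal sketch of how such a descriptor might be computed for a single grayscale patch. The per-pixel feature set used here (pixel coordinates, intensity, and absolute first derivatives) and the function names `pixel_features`, `region_covariance`, and `feature_mean` are illustrative assumptions, not the feature set or code of the paper. The off-diagonal entries of the resulting matrix are what encode correlations between features, and the mean vector is one simple form of first-order statistic.

```python
import numpy as np


def pixel_features(patch):
    """Return an (N, 5) array with one feature vector per pixel.

    Assumed (hypothetical) features: [x, y, intensity, |dI/dx|, |dI/dy|].
    """
    patch = np.asarray(patch, dtype=float)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]          # row (y) and column (x) coordinates
    gy, gx = np.gradient(patch)          # derivatives along rows, then columns
    feats = np.stack([xs, ys, patch, np.abs(gx), np.abs(gy)], axis=-1)
    return feats.reshape(-1, 5)


def region_covariance(patch):
    """5 x 5 covariance matrix of the per-pixel features (second-order statistics)."""
    return np.cov(pixel_features(patch), rowvar=False)


def feature_mean(patch):
    """Mean feature vector of the patch (first-order statistics)."""
    return pixel_features(patch).mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.random((8, 8))           # synthetic patch, for illustration only
    C = region_covariance(patch)
    mu = feature_mean(patch)
    print(C.shape, mu.shape)
```

Whatever the patch size, the descriptor stays a fixed 5 x 5 symmetric matrix under this assumed feature set, which is what makes it a low-dimensional representation. How patches are then compared to produce a saliency score is not specified in the abstract and is deliberately left out of this sketch.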