Yanulevskaya Victoria, Uijlings Jasper, Geusebroek Jan-Mark, Sebe Nicu, Smeulders Arnold
Department of Information Engineering and Computer Science, University of Trento, Italy.
J Vis. 2013 Nov 26;13(13):27. doi: 10.1167/13.13.27.
State-of-the-art bottom-up saliency models often assign high saliency values at or near high-contrast edges, whereas people tend to look within the regions delineated by those edges, namely the objects. To resolve this inconsistency, we estimate saliency at the level of coherent image regions. According to object-based attention theory, the human brain groups similar pixels into coherent regions called proto-objects. The saliency of these proto-objects is estimated and combined, and attention is directed to the most salient image regions. In this paper we employ state-of-the-art computer vision techniques to implement a proto-object-based model of visual attention. In particular, a hierarchical image segmentation algorithm is used to extract proto-objects. The two most powerful ways to estimate saliency, rarity-based and contrast-based saliency, are generalized to assess saliency at the proto-object level. Rarity-based saliency assesses whether the proto-object contains rare or outstanding details. Contrast-based saliency estimates how much the proto-object differs from its surroundings. However, not all image regions with high contrast to their surroundings attract human attention. We take this into account by distinguishing between external and internal contrast-based saliency: external contrast-based saliency estimates the difference between the proto-object and the rest of the image, whereas internal contrast-based saliency estimates the complexity of the proto-object itself. We evaluate the performance of the proposed method and its components on two challenging eye-fixation datasets (Judd, Ehinger, Durand, & Torralba, 2009; Subramanian, Katti, Sebe, Kankanhalli, & Chua, 2010). The results show the importance of rarity-based and both external and internal contrast-based saliency in fixation prediction. Moreover, the comparison with state-of-the-art computational models for visual saliency demonstrates the advantage of proto-objects as units of analysis.
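The three region-level cues described in the abstract can be illustrated with a minimal sketch. The abstract does not specify features, distance measures, or how the cues are combined, so everything concrete here is an assumption made for illustration: a grayscale image with values in [0, 1], a precomputed label map standing in for the hierarchical segmentation, self-information of quantized intensities for rarity, a chi-square histogram distance for external contrast, and histogram entropy for internal contrast. The function name `region_saliency` and the `bins` parameter are hypothetical.

```python
import numpy as np

def region_saliency(image, labels, bins=8):
    """Return {label: (rarity, external, internal)} for each region.

    Illustrative assumptions, not the paper's method:
    image  - 2D float array with values in [0, 1]
    labels - 2D int array of the same shape (one id per proto-object)
    """
    # Quantize intensities into `bins` levels.
    q = np.minimum((image * bins).astype(int), bins - 1)
    global_counts = np.bincount(q.ravel(), minlength=bins).astype(float)
    global_p = global_counts / global_counts.sum()
    # Per-pixel self-information: rare intensity levels score high.
    info = -np.log(global_p[q])

    scores = {}
    for lab in np.unique(labels):
        mask = labels == lab
        inside = np.bincount(q[mask], minlength=bins).astype(float)
        outside = global_counts - inside
        p_in = inside / inside.sum()
        p_out = outside / max(outside.sum(), 1.0)

        # Rarity: mean self-information of the pixels in the region.
        rarity = info[mask].mean()

        # External contrast: chi-square distance between the region's
        # histogram and the histogram of the rest of the image.
        denom = p_in + p_out
        nz = denom > 0
        external = 0.5 * np.sum((p_in[nz] - p_out[nz]) ** 2 / denom[nz])

        # Internal contrast: entropy of the region's own histogram,
        # used here as a stand-in for region complexity.
        pos = p_in > 0
        internal = -np.sum(p_in[pos] * np.log(p_in[pos]))

        scores[int(lab)] = (float(rarity), float(external), float(internal))
    return scores

# Toy example: a uniform background with a small two-valued patch.
image = np.full((6, 6), 0.1)
image[2:4, 2:4] = [[0.9, 0.5], [0.5, 0.9]]
labels = np.zeros((6, 6), dtype=int)
labels[2:4, 2:4] = 1
scores = region_saliency(image, labels)
```

In this toy case the patch (label 1) scores higher than the background on rarity and internal contrast, while external contrast is symmetric because there are only two regions. How the three scores should be weighted or merged into a single map is left open here, since the abstract does not say.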