Department of Computer Science, University of Maryland, College Park, MD 20742, USA.
IEEE Trans Pattern Anal Mach Intell. 2012 Apr;34(4):639-53. doi: 10.1109/TPAMI.2011.171.
Attention is an integral part of the human visual system and has been widely studied in the visual attention literature. Human eyes fixate on important locations in a scene, and every fixation point lies inside a region of arbitrary shape and size, which can be either an entire object or a part of one. Using the fixation point as an identification marker on the object, we propose a method that segments the object of interest by finding the "optimal" closed contour around the fixation point in polar space, avoiding the perennial problem of scale in Cartesian space. The proposed segmentation process is carried out in two separate steps: first, all visual cues are combined to generate a probabilistic boundary edge map of the scene; second, the "optimal" closed contour around a given fixation point is found in this edge map. Having two separate steps also makes it possible to establish a simple feedback between the mid-level cue (regions) and the low-level visual cues (edges), and we propose a segmentation refinement process based on this feedback. Finally, our experiments show the promise of the proposed method as an automatic segmentation framework for a general-purpose visual system.
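The second step described above (finding a closed contour around a fixation point in polar space) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which optimizer is used, so dynamic programming stands in for it here, and the cost `1 - probability`, the function names `to_polar` and `closed_contour`, the `max_jump` smoothness limit, and the synthetic ring-shaped edge map are all assumptions made for illustration. The first step (combining visual cues into the edge map) is not shown.

```python
import numpy as np

def to_polar(edge_map, fixation, n_theta=90, n_r=None):
    """Resample a Cartesian edge map onto a (theta, r) grid centred on the fixation."""
    h, w = edge_map.shape
    fy, fx = fixation
    # Largest radius that keeps every sample inside the image.
    max_r = int(min(fy, fx, h - 1 - fy, w - 1 - fx))
    n_r = max_r if n_r is None else min(n_r, max_r)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    radii = np.arange(1, n_r + 1)
    ys = np.rint(fy + radii[None, :] * np.sin(thetas[:, None])).astype(int)
    xs = np.rint(fx + radii[None, :] * np.cos(thetas[:, None])).astype(int)
    return edge_map[ys, xs], thetas, radii

def closed_contour(polar_edges, max_jump=1):
    """Pick one radius index per angle so that the summed cost (1 - probability)
    is minimal, neighbouring radii differ by at most max_jump, and the path
    closes on itself. Assumed stand-in optimizer: exhaustive start radius
    plus dynamic programming."""
    n_theta, n_r = polar_edges.shape
    cost = 1.0 - polar_edges
    best_total, best_path = np.inf, None
    for r0 in range(n_r):                      # try every starting radius
        acc = np.full(n_r, np.inf)
        acc[r0] = cost[0, r0]
        back = np.zeros((n_theta, n_r), dtype=int)
        for t in range(1, n_theta):
            new = np.full(n_r, np.inf)
            for r in range(n_r):
                lo, hi = max(0, r - max_jump), min(n_r, r + max_jump + 1)
                k = lo + int(np.argmin(acc[lo:hi]))
                new[r] = acc[k] + cost[t, r]
                back[t, r] = k
            acc = new
        # Closure: the last radius must be within max_jump of the first.
        lo, hi = max(0, r0 - max_jump), min(n_r, r0 + max_jump + 1)
        end = lo + int(np.argmin(acc[lo:hi]))
        if acc[end] < best_total:
            path = [end]
            for t in range(n_theta - 1, 0, -1):
                path.append(back[t, path[-1]])
            best_total, best_path = acc[end], path[::-1]
    return np.array(best_path)

# Synthetic demo: a ring of high boundary probability, fixation off-centre inside it.
yy, xx = np.mgrid[0:65, 0:65]
dist = np.hypot(yy - 32, xx - 32)
edge_map = (np.abs(dist - 12.0) <= 1.5).astype(float)
fixation = (30, 33)  # (row, col), hypothetical fixation point

polar, thetas, radii = to_polar(edge_map, fixation, n_theta=60, n_r=20)
idx = closed_contour(polar)

# Map the selected radii back to Cartesian pixel coordinates.
contour_r = radii[idx]
ys = np.rint(fixation[0] + contour_r * np.sin(thetas)).astype(int)
xs = np.rint(fixation[1] + contour_r * np.cos(thetas)).astype(int)
```

In polar coordinates every candidate contour has exactly one sample per angle, so a small contour and a large one are compared over the same number of terms. That is one way to read the abstract's remark about avoiding the scale problem of Cartesian space.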