Wilder Matthew H, Mozer Michael C, Wickens Christopher D
Department of Computer Science, University of Colorado, 4595 Brookfield Dr., Boulder, CO 80309, USA.
J Vis. 2011 Feb 9;11(2):8. doi: 10.1167/11.2.8.
Although diverse, theories of visual attention generally share the notion that attention is controlled by some combination of three distinct strategies: (1) exogenous cuing from locally contrasting primitive visual features, such as abrupt onsets or color singletons (e.g., L. Itti, C. Koch, & E. Niebur, 1998), (2) endogenous gain modulation of exogenous activations, used to guide attention to task-relevant features (e.g., V. Navalpakkam & L. Itti, 2007; J. Wolfe, 1994, 2007), and (3) endogenous prediction of likely locations of interest, based on task and scene gist (e.g., A. Torralba, A. Oliva, M. Castelhano, & J. Henderson, 2006). However, little work has been done to synthesize these disparate theories. In this work, we propose a unifying conceptualization in which attention is controlled along two dimensions: the degree of task focus and the contextual scale of operation. Previously proposed strategies, and their combinations, can be viewed as instances of this one mechanism. Thus, this theory serves not as a replacement for existing models but as a means of bringing them into a coherent framework. We present an implementation of this theory and demonstrate its applicability to a wide range of attentional phenomena. The model accounts for key results in visual search with synthetic images and makes reasonable predictions for human eye movements in search tasks involving real-world images. In addition, the theory offers an unusual perspective on attention that places a fundamental emphasis on the role of experience and task-related knowledge.