Faces and text attract gaze independent of the task: Experimental data and computer model.

Author information

Cerf Moran, Frady E Paxon, Koch Christof

Affiliation

Computation and Neural Systems, California Institute of Technology, Pasadena, CA, USA.

Publication information

J Vis. 2009 Nov 18;9(12):10.1-15. doi: 10.1167/9.12.10.

Abstract

Previous studies of eye gaze have shown that when looking at images containing human faces, observers tend to rapidly focus on the facial regions. But is this true of other high-level image features as well? We here investigate the extent to which natural scenes containing faces, text elements, and cell phones (as a suitable control) attract attention by tracking the eye movements of subjects in two types of tasks: free viewing and search. We observed that subjects in free-viewing conditions look at faces and text 16.6 and 11.1 times more than similar regions normalized for size and position of the face and text. In terms of attracting gaze, text is almost as effective as faces. Furthermore, it is difficult to avoid looking at faces and text even when doing so imposes a cost. We also found that subjects took longer in making their initial saccade when they were told to avoid faces/text and their saccades landed on a non-face/non-text object. We refine a well-known bottom-up computer model of saliency-driven attention that includes conspicuity maps for color, orientation, and intensity by adding high-level semantic information (i.e., the location of faces or text) and demonstrate that this significantly improves the ability to predict eye fixations in natural images. Our enhanced model's predictions yield an area under the ROC curve over 84% for images that contain faces or text when compared against the actual fixation pattern of subjects. This suggests that the primate visual system allocates attention using such an enhanced saliency map.
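The enhanced model described above combines low-level conspicuity maps (color, orientation, intensity) with a high-level channel marking face or text locations, and is evaluated by the area under the ROC curve against subjects' fixations. The following is a minimal sketch of that scheme, not the authors' actual implementation: the map names, weights, and synthetic data are illustrative assumptions, and the AUC is computed via the rank-sum (Mann-Whitney) identity rather than any specific toolbox.

```python
import numpy as np

def combined_saliency(color, orientation, intensity, face_map,
                      w_low=1.0, w_face=1.0):
    """Sum normalized conspicuity maps, then add a high-level face/text channel."""
    def norm(m):
        m = m - m.min()
        return m / m.max() if m.max() > 0 else m
    low = (norm(color) + norm(orientation) + norm(intensity)) / 3.0
    return w_low * norm(low) + w_face * norm(face_map)

def fixation_auc(saliency, fixations, rng, n_controls=10):
    """ROC AUC: saliency at fixated pixels vs. random control pixels."""
    pos = np.array([saliency[y, x] for (y, x) in fixations])
    ys = rng.integers(0, saliency.shape[0], size=len(pos) * n_controls)
    xs = rng.integers(0, saliency.shape[1], size=len(pos) * n_controls)
    neg = saliency[ys, xs]
    scores = np.concatenate([pos, neg])
    ranks = scores.argsort().argsort() + 1  # 1..N ranks; ties broken arbitrarily
    # Mann-Whitney U statistic normalized by n_pos * n_neg equals the ROC AUC.
    n_pos, n_neg = len(pos), len(neg)
    return (ranks[:n_pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy demo: random low-level maps plus one detected "face" region.
rng = np.random.default_rng(0)
h, w = 64, 64
color, orientation, intensity = (rng.random((h, w)) for _ in range(3))
face_map = np.zeros((h, w))
face_map[20:30, 20:30] = 1.0  # hypothetical face-detector output
sal = combined_saliency(color, orientation, intensity, face_map)
fixations = [(25, 25), (22, 28), (27, 21)]  # fixations landing on the face
auc = fixation_auc(sal, fixations, rng)
```

Because the face channel dominates the combined map, fixations inside the face region score near the top of the saliency distribution and the AUC approaches 1; with the face channel weighted to zero, the same fixations would be scored by low-level conspicuity alone.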

