Ehinger Krista A, Hidalgo-Sotelo Barbara, Torralba Antonio, Oliva Aude
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology.
Vis cogn. 2009 Aug 1;17(6-7):945-978. doi: 10.1080/13506280902834720.
How predictable are human eye movements during search in real-world scenes? We recorded 14 observers' eye movements as they performed a search task (person detection) in 912 outdoor scenes. Observers were highly consistent in the regions fixated during search, even when the target was absent from the scene. These eye movements were used to evaluate computational models of search guidance from three sources: saliency, target features, and scene context. Each of these models independently outperformed a cross-image control in predicting human fixations. Models that combined sources of guidance ultimately predicted 94% of human agreement, with the scene context component providing the most explanatory power. None of the models, however, could reach the precision and fidelity of an attentional map defined by human fixations. This work puts forth a benchmark for computational models of search in real-world scenes. Further improvements in modeling should capture mechanisms underlying the selectivity of observers' fixations during search.