Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany.
Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany.
PLoS Comput Biol. 2023 Oct 26;19(10):e1011512. doi: 10.1371/journal.pcbi.1011512. eCollection 2023 Oct.
The complexity of natural scenes makes it challenging to experimentally study the mechanisms behind human gaze behavior when viewing dynamic environments. Historically, eye movements were believed to be driven primarily by space-based attention towards locations with salient features. Increasing evidence suggests, however, that visual attention does not select locations with high saliency but operates on attentional units given by the objects in the scene. We present a new computational framework to investigate the importance of objects for attentional guidance. This framework is designed to simulate realistic scanpaths for dynamic real-world scenes, including saccade timing and smooth pursuit behavior. Individual model components are based on psychophysically uncovered mechanisms of visual attention and saccadic decision-making. All mechanisms are implemented in a modular fashion with a small number of readily interpretable parameters. To systematically analyze the importance of objects in guiding gaze behavior, we implemented five different models within this framework: (1) a purely spatial model based on low-level saliency; (2) a purely spatial model based on high-level saliency; (3) an object-based model that incorporates low-level saliency for each object; (4) an object-based model that uses no saliency information; and (5) a mixed model with object-based attention and selection but space-based inhibition of return. Using evolutionary algorithms, we optimized each model's parameters to reproduce the saccade amplitude and fixation duration distributions of human scanpaths. We compared model performance with respect to spatial and temporal fixation behavior, including the proportion of fixations exploring the background, as well as detecting, inspecting, and returning to objects.
A model with object-based attention and inhibition, which uses saliency information to prioritize between objects for saccadic selection, leads to scanpath statistics with the highest similarity to the human data. This demonstrates that scanpath models benefit from object-based attention and selection, suggesting that object-level attentional units play an important role in guiding attentional processing.
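The parameter-fitting step mentioned in the abstract (tuning a model with an evolutionary algorithm until its simulated fixation durations match a human distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the log-normal toy simulator, the quantile-based loss, the function names, the population settings, and the synthetic "human" target are all assumptions chosen only to keep the example short and deterministic.

```python
import math
import random

def simulate_durations(params, noise):
    """Toy stand-in for a scanpath model (assumption): log-normal durations in ms."""
    mu, sigma = params
    return [math.exp(mu + abs(sigma) * z) for z in noise]

def quantiles(sample, k=9):
    """Return k evenly spaced inner quantiles of a sample."""
    s = sorted(sample)
    return [s[(i + 1) * len(s) // (k + 1)] for i in range(k)]

def loss(params, target_q, noise):
    """Mean absolute difference between simulated and target quantiles."""
    sim_q = quantiles(simulate_durations(params, noise))
    return sum(abs(a - b) for a, b in zip(sim_q, target_q)) / len(target_q)

def evolve(target_q, noise, rng, generations=60, pop_size=20, n_parents=5, step=0.2):
    """Simple elitist evolution: mutate the best candidates, keep the fittest."""
    population = [(rng.uniform(4.0, 7.0), rng.uniform(0.1, 1.0))
                  for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population,
                         key=lambda p: loss(p, target_q, noise))[:n_parents]
        children = [
            (p[0] + rng.gauss(0, step), p[1] + rng.gauss(0, step))
            for p in parents
            for _ in range(pop_size // n_parents - 1)
        ]
        population = parents + children  # elitism: parents survive unchanged
        step *= 0.95                     # shrink the mutation size over time
    return min(population, key=lambda p: loss(p, target_q, noise))

rng = random.Random(0)
# Fixed noise draws make the loss a deterministic function of the parameters.
noise = [rng.gauss(0, 1) for _ in range(1000)]
# Synthetic "human" durations, generated here purely for illustration.
target_q = quantiles([math.exp(5.6 + 0.5 * rng.gauss(0, 1)) for _ in range(1000)])

best = evolve(target_q, noise, rng)
print("fitted (mu, sigma):", best)
print("quantile loss in ms:", loss(best, target_q, noise))
```

Reusing the same noise draws for every candidate is a design choice of this sketch: it removes simulation noise from the comparison between candidates, so selection depends only on the parameters. A real scanpath model would replace `simulate_durations` and would typically match saccade amplitudes as well as fixation durations.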