Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases.

Authors

Gupta Shashi Kant, Zhang Mengmi, Wu Chia-Chien, Wolfe Jeremy M, Kreiman Gabriel

Affiliations

Indian Institute of Technology Kanpur, India.

Children's Hospital, Harvard Medical School.

Publication

Adv Neural Inf Process Syst. 2021 Dec;34:6946-6959.

Abstract

Visual search is a ubiquitous and often challenging daily task, exemplified by looking for the car keys at home or a friend in a crowd. An intriguing property of some classical search tasks is an asymmetry such that finding a target A among distractors B can be easier than finding B among A. To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found. The model integrates eccentricity-dependent visual recognition with target-dependent top-down cues. We compared the model against human behavior in six paradigmatic search tasks that show asymmetry in humans. Without prior exposure to the stimuli or task-specific training, the model provides a plausible mechanism for search asymmetry. We hypothesized that the polarity of search asymmetry arises from experience with the natural environment. We tested this hypothesis by training the model on augmented versions of ImageNet where the biases of natural images were either removed or reversed. The polarity of search asymmetry disappeared or was altered depending on the training protocol. This study highlights how classical perceptual properties can emerge in neural network models, without the need for task-specific training, but rather as a consequence of the statistical properties of the developmental diet fed to the model. All source code and data are publicly available at https://github.com/kreimanlab/VisualSearchAsymmetry.

