Gupta Shashi Kant, Zhang Mengmi, Wu Chia-Chien, Wolfe Jeremy M, Kreiman Gabriel
Indian Institute of Technology Kanpur, India.
Children's Hospital, Harvard Medical School.
Adv Neural Inf Process Syst. 2021 Dec;34:6946-6959.
Visual search is a ubiquitous and often challenging daily task, exemplified by looking for the car keys at home or a friend in a crowd. An intriguing property of some classical search tasks is an asymmetry such that finding a target A among distractors B can be easier than finding B among A. To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found. The model integrates eccentricity-dependent visual recognition with target-dependent top-down cues. We compared the model against human behavior in six paradigmatic search tasks that show asymmetry in humans. Without prior exposure to the stimuli or task-specific training, the model provides a plausible mechanism for search asymmetry. We hypothesized that the polarity of search asymmetry arises from experience with the natural environment. We tested this hypothesis by training the model on augmented versions of ImageNet where the biases of natural images were either removed or reversed. The polarity of search asymmetry disappeared or was altered depending on the training protocol. This study highlights how classical perceptual properties can emerge in neural network models, without the need for task-specific training, but rather as a consequence of the statistical properties of the developmental diet fed to the model. All source code and data are publicly available at https://github.com/kreimanlab/VisualSearchAsymmetry.
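The abstract describes the model only at a high level: eccentricity-dependent recognition combined with target-dependent top-down cues yields an attention map, and fixations are generated until the target is found. The following is a minimal, hypothetical Python sketch of a greedy fixation loop of that kind with inhibition of return; the function name, parameters, and toy attention map are illustrative assumptions, not the paper's implementation (the actual code is in the linked repository).

import numpy as np

def simulate_search(attention_map, target_location, max_fixations=50, ior_radius=3):
    """Greedy fixation loop: repeatedly fixate the peak of a target-modulated
    attention map, suppress visited locations (inhibition of return), and stop
    once a fixation lands on the target. Hypothetical sketch, not the paper's code."""
    attn = attention_map.astype(float).copy()
    fixations = []
    for _ in range(max_fixations):
        # Next fixation goes to the current maximum of the attention map.
        y, x = np.unravel_index(np.argmax(attn), attn.shape)
        fixations.append((int(y), int(x)))
        # Stop if the fixation falls close enough to the target location.
        if np.hypot(y - target_location[0], x - target_location[1]) <= ior_radius:
            return fixations
        # Inhibition of return: suppress a disk around the visited location.
        yy, xx = np.ogrid[:attn.shape[0], :attn.shape[1]]
        attn[(yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2] = -np.inf
    return fixations

# Example: random attention map with an (illustrative) top-down boost at the target.
rng = np.random.default_rng(0)
amap = rng.random((20, 20))
amap[12, 7] += 2.0
print(simulate_search(amap, target_location=(12, 7)))

In such a scheme, search asymmetry would show up as a systematic difference in the number of fixations needed when target and distractor identities are swapped, driven by how strongly the top-down cue boosts each item.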