Cognitive, Linguistic, and Psychological Sciences Department, Institute for Brain Sciences, Brown University, Providence, RI, USA.
Front Psychol. 2011 Nov 15;2:326. doi: 10.3389/fpsyg.2011.00326. eCollection 2011.
Machine vision has progressed markedly in recent years. Robust face detection and identification algorithms are readily available to consumers, and modern computer vision algorithms for generic object recognition are now coping with the richness and complexity of natural visual scenes. Unlike early vision models of object recognition, which emphasized figure-ground segmentation and the spatial relations between parts, recent successful approaches compute loose collections of image features without prior segmentation or any explicit encoding of spatial relations. Although these models remain simplistic accounts of visual processing, they suggest that, in principle, bottom-up activation of a loose collection of image features could support the rapid recognition of natural object categories and provide an initial coarse visual representation before more complex visual routines and attentional mechanisms come into play. Focusing on biologically plausible computational models of (bottom-up) pre-attentive visual recognition, we review some of the key visual features described in the literature. We discuss the consistency of these feature-based representations with classical theories from visual psychology and test their ability to account for human performance on a rapid object categorization task.