Neurophysics, Philipps University Marburg, Marburg, Germany.
PLoS One. 2011;6(10):e25373. doi: 10.1371/journal.pone.0025373. Epub 2011 Oct 5.
The human visual system seems to be particularly efficient at detecting faces. This efficiency sometimes comes at the cost of wrongly seeing faces in arbitrary patterns; famous examples include a rock formation on Mars and the browning pattern on a slice of toast. In machine vision, face detection has made considerable progress and has become a standard feature of many digital cameras. Arguably the most widespread algorithm for such applications (the "Viola-Jones" algorithm) achieves high detection rates at high computational efficiency. To what extent do the patterns that the algorithm mistakenly classifies as faces also fool humans? We selected three kinds of stimuli from real-life, first-person perspective movies based on the algorithm's output: correct detections ("real faces"), false positives ("illusory faces") and correctly rejected locations ("non faces"). Observers were shown pairs of these stimuli for 20 ms and had to direct their gaze to the location of the face. We found that illusory faces were mistaken for faces more frequently than non faces. In addition, rotating the real face yielded more errors, while rotating the illusory face yielded fewer errors. Using colored stimuli increased overall performance but did not change the pattern of results. When the eye movement was replaced by a manual response, however, the preference for illusory faces over non faces disappeared. Taken together, our data show that humans make face-detection errors similar to those of the Viola-Jones algorithm when directing their gaze to briefly presented stimuli. In particular, the relative spatial arrangement of oriented filters seems to be relevant. This suggests that efficient face detection in humans is likely to be pre-attentive and based on rather simple features, such as those encoded in the early visual system.
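The three stimulus categories can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say how detector output was compared with the true face locations, so the overlap criterion (intersection over union with a 0.5 threshold), the box format, and the names `iou` and `label_window` are all assumptions made for illustration. The sketch also returns a fourth label, "missed face", which the study did not use as a stimulus category.

```python
from typing import List, Tuple

# Assumed box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def label_window(window: Box, detected: bool,
                 annotated_faces: List[Box],
                 threshold: float = 0.5) -> str:
    """Label one image window given the detector's decision.

    `detected` is the detector's output for this window;
    `annotated_faces` are hand-labelled true face locations
    (hypothetical ground truth, assumed for this sketch).
    """
    is_face = any(iou(window, f) >= threshold for f in annotated_faces)
    if detected and is_face:
        return "real face"        # correct detection
    if detected:
        return "illusory face"    # false positive
    if not is_face:
        return "non face"         # correct rejection
    return "missed face"          # false negative, not a stimulus class


if __name__ == "__main__":
    faces = [(10, 10, 20, 20)]
    print(label_window((12, 12, 20, 20), True, faces))
    print(label_window((100, 100, 20, 20), True, faces))
    print(label_window((100, 100, 20, 20), False, faces))
```

In practice the `detected` flag would come from running a Viola-Jones style cascade over each movie frame; the labelling logic stays the same whichever detector implementation is used.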