Universidad de Zaragoza, I3A, Zaragoza, Spain.
Universidad de Zaragoza, I3A, Max Planck Institute for Informatics, Zaragoza, Spain.
J Vis. 2021 Feb 3;21(2):2. doi: 10.1167/jov.21.2.2.
Observing and recognizing materials is a fundamental part of our daily life. Under typical viewing conditions, we are capable of effortlessly identifying the objects that surround us and recognizing the materials they are made of. Nevertheless, understanding the underlying perceptual processes that take place to accurately discern the visual properties of an object is a long-standing problem. In this work, we perform a comprehensive and systematic analysis of how the interplay of geometry, illumination, and their spatial frequencies affects human performance on material recognition tasks. We carry out large-scale behavioral experiments where participants are asked to recognize different reference materials among a pool of candidate samples. In the different experiments, we carefully sample the information in the frequency domain of the stimuli. From our analysis, we find significant first-order interactions between the geometry and the illumination, of both the reference and the candidates. In addition, we observe that simple image statistics and higher-order image histograms do not correlate with human performance. Therefore, we perform a high-level comparison of highly nonlinear statistics by training a deep neural network on material recognition tasks. Our results show that such models can accurately classify materials, which suggests that they are capable of defining a meaningful representation of material appearance from labeled proximal image data. Last, we find preliminary evidence that these highly nonlinear models and humans may use similar high-level factors for material recognition tasks.