Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany.
School of Psychology, University of Sydney, Sydney, Australia.
Nat Hum Behav. 2021 Oct;5(10):1402-1417. doi: 10.1038/s41562-021-01097-6. Epub 2021 May 6.
Reflectance, lighting and geometry combine in complex ways to create images. How do we disentangle these to perceive individual properties, such as surface glossiness? We suggest that brains disentangle properties by learning to model statistical structure in proximal images. To test this hypothesis, we trained unsupervised generative neural networks on renderings of glossy surfaces and compared their representations with human gloss judgements. The networks spontaneously cluster images according to distal properties such as reflectance and illumination, despite receiving no explicit information about these properties. Intriguingly, the resulting representations also predict the specific patterns of 'successes' and 'errors' in human perception. Linearly decoding specular reflectance from the model's internal code predicts human gloss perception better than ground truth, supervised networks or control models, and it predicts, on an image-by-image basis, illusions of gloss perception caused by interactions between material, shape and lighting. Unsupervised learning may underlie many perceptual dimensions in vision and beyond.
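As a rough illustration of what "linearly decoding specular reflectance from the model's internal code" can mean, the sketch below fits an ordinary least-squares readout from latent vectors to a scalar reflectance value and evaluates it on held-out items. It is a minimal sketch, not the authors' pipeline. The latent codes here are synthetic stand-ins generated from random numbers, and the function names `fit_linear_decoder` and `decode`, the latent dimensionality, and the train/test split are all assumptions made for the example.

```python
import numpy as np


def fit_linear_decoder(latents, targets):
    """Least-squares fit of targets ~ latents @ w + b (hypothetical helper)."""
    # Append a column of ones so the intercept is fitted with the weights.
    design = np.hstack([latents, np.ones((latents.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef[:-1], coef[-1]


def decode(latents, w, b):
    """Apply the fitted linear readout to new latent codes."""
    return latents @ w + b


# Synthetic stand-in data (assumption): in a real analysis the latents would
# come from an unsupervised network's encoder applied to rendered images.
rng = np.random.default_rng(0)
n_images, n_latent = 500, 10
true_reflectance = rng.uniform(0.0, 1.0, n_images)
mixing = rng.normal(size=n_latent)
latents = np.outer(true_reflectance, mixing)
latents += 0.1 * rng.normal(size=(n_images, n_latent))

# Fit on one subset of images, evaluate on the rest.
train, test = slice(0, 400), slice(400, None)
w, b = fit_linear_decoder(latents[train], true_reflectance[train])
predicted = decode(latents[test], w, b)

# Correlation between decoded and true reflectance on held-out images.
r = np.corrcoef(predicted, true_reflectance[test])[0, 1]
print(f"held-out correlation: {r:.3f}")
```

In a comparison like the one the abstract describes, the decoded values (`predicted` here) would be the quantity set against human gloss judgements, in place of the ground-truth reflectance itself.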