Neuroscience Institute, Carnegie Mellon University, Pittsburgh, United States.
Department of Psychology, Emory University, Atlanta, United States.
Elife. 2022 May 25;11:e74943. doi: 10.7554/eLife.74943.
Categorization of everyday objects requires that humans form representations of shape that are tolerant to variations among exemplars. Yet, how such invariant shape representations develop remains poorly understood. By comparing human infants (6-12 months; N=82) to computational models of vision using comparable procedures, we shed light on the origins and mechanisms underlying object perception. Following habituation to a never-before-seen object, infants classified other novel objects across variations in their component parts. Comparisons to several computational models of vision, including models of high-level and low-level vision, revealed that infants' performance was best described by a model of shape based on the skeletal structure. Interestingly, infants outperformed a range of artificial neural network models, selected for their massive object experience and biological plausibility, under the same conditions. Altogether, these findings suggest that robust representations of shape can be formed with little language or object experience by relying on the perceptually invariant skeletal structure.