Cain Ellis S, Ryskin Rachel A, Yu Chen
Department of Cognitive and Information Sciences, University of California, Merced.
Department of Psychology, Center for Perceptual Systems, University of Texas, Austin.
Cogn Sci. 2025 Jun;49(6):e70078. doi: 10.1111/cogs.70078.
According to the cross-situational learning account, infants aggregate statistical information across multiple parent naming events to resolve ambiguous word-referent mappings within individual naming events. Previous experimental studies have shown that infant and adult learners can build correct mappings from statistical regularities encoded across multiple learning situations; however, other studies that use more naturalistic stimuli (e.g., real-world video) report that adults perform poorly at inferring the correct referent. Those results suggest that cross-situational learning may not be useful for solving the mapping problem in the real world, because cross-situational statistics from the real world are much more ambiguous than those created in experimental studies. To examine the feasibility of cross-situational learning in everyday contexts, the present study quantifies the visual-audio statistics of one everyday activity: parent-child toy play. We analyze parent naming events in a video corpus of infant-perspective scenes recorded during parent-child toy play in a naturalistic lab setting, and we find three distinct properties that characterize the statistical regularities perceived by young learners: (1) young learners perceive a limited number of visual scene compositions at the moments when they hear object names; (2) the frequencies of parent naming events are distributed in a skewed, Zipfian fashion; and (3) cross-situational statistics in naturalistic toy play are comparable to those used in laboratory experiments. Our results underscore the importance of quantifying the statistical regularities in the input from the learner's perspective in order to shed light on the mechanisms supporting early word learning.