Lazaridou Angeliki, Marelli Marco, Baroni Marco
Center for Mind/Brain Sciences, University of Trento.
Department of Experimental Psychology, Ghent University.
Cogn Sci. 2017 Apr;41 Suppl 4:677-705. doi: 10.1111/cogs.12481. Epub 2017 Mar 21.
By the time they reach early adulthood, English speakers are familiar with the meaning of thousands of words. In recent decades, computational simulations known as distributional semantic models (DSMs) have demonstrated that it is possible to induce word meaning representations solely from word co-occurrence statistics extracted from a large amount of text. However, while these models learn in batch mode from large corpora, human word learning proceeds incrementally after minimal exposure to new words. In this study, we run a set of experiments investigating whether minimal distributional evidence from very short passages suffices to trigger successful word learning in subjects, testing their linguistic and visual intuitions about the concepts associated with new words. After confirming that subjects are indeed very efficient distributional learners even from small amounts of evidence, we test a DSM on the same multimodal task, finding that it behaves in a remarkably human-like way. We conclude that DSMs provide a convincing computational account of word learning even at the early stages in which a word is first encountered, and that the way they build meaning representations can offer new insights into human language acquisition.
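As a rough illustration of the distributional idea summarized in the abstract, the sketch below builds count-based co-occurrence vectors from a toy corpus and then estimates a vector for a previously unseen word from a single short passage. Everything here is an assumption made for illustration: the toy corpus, the nonce word "zorp", the window size, the function names, and the choice to sum the vectors of the known context words. It is not the model or the data used in the study.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Count, for each word, the words appearing within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def learn_new_word(passage, new_word, vectors):
    """Estimate a vector for `new_word` by summing the vectors of the
    known words that surround it in a short passage (an illustrative choice)."""
    estimate = Counter()
    for token in passage.lower().split():
        if token != new_word and token in vectors:
            estimate.update(vectors[token])
    return estimate


# Hypothetical toy corpus standing in for the "large amount of text".
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog barked at the cat",
    "the car drove on the road",
    "the truck drove on the road",
]
vectors = cooccurrence_vectors(corpus)

# A single short passage containing a made-up word the "model" has never seen.
new_vec = learn_new_word("the zorp chased the mouse", "zorp", vectors)

for known in ("cat", "dog", "road"):
    print(known, round(cosine(new_vec, vectors[known]), 3))
```

The printed similarities show how the estimated vector for the new word relates to words already known from the corpus. With a corpus this small, frequent function words such as "the" dominate the counts, so the numbers should not be read as meaningful; a realistic setup would use a large corpus and some form of reweighting.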