Department of Brain & Cognitive Sciences, University of Rochester, Rochester, NY 14627, USA.
Cogn Psychol. 2013 Feb;66(1):30-54. doi: 10.1016/j.cogpsych.2012.09.001. Epub 2012 Oct 23.
A fundamental component of language acquisition involves organizing words into grammatical categories. Previous literature has suggested a number of ways in which this categorization task might be accomplished. Here we ask whether the patterning of the words in a corpus of linguistic input (distributional information) is sufficient, along with a small set of learning biases, to extract these underlying structural categories. In a series of experiments, we show that learners can acquire linguistic form-classes, generalizing from instances of the distributional contexts of individual words in the exposure set to the full range of contexts for all the words in the set. Crucially, we explore how several specific distributional variables enable learners to form a category of lexical items and generalize to novel words, yet also allow for exceptions that maintain lexical specificity. We suggest that learners are sensitive to the contexts of individual words, the overlaps among contexts across words, the non-overlap of contexts (or systematic gaps in information), and the size of the exposure set. We also ask how learners determine the category membership of a new word for which there is very sparse contextual information. We find that, when there are strong category cues and robust category learning of other words, adults readily generalize the distributional properties of the learned category to a new word that shares just one context with the other category members. However, as the distributional cues regarding the category become sparser and contain more consistent gaps, learners show more conservatism in generalizing distributional properties to the novel word. Taken together, these results show that learners are highly systematic in their use of the distributional properties of the input corpus, using them in a principled way to determine when to generalize and when to preserve lexical specificity.