Department of Cognitive Science and Artificial Intelligence, Tilburg University.
Bocconi Institute for Data Science and Analytics, Bocconi University.
Cogn Sci. 2021 Apr;45(4):e12963. doi: 10.1111/cogs.12963.
In this study, we use temporally aligned word embeddings and a large diachronic corpus of English to quantify language change in a data-driven, scalable way that is grounded in language use. We show a unique and reliable relation between measures of language change and age of acquisition (AoA) while controlling for frequency, contextual diversity, concreteness, length, dominant part of speech, orthographic neighborhood density, and diachronic frequency variation. We analyze measures of language change that target both the change in a word's lexical representation and the change in the relation between that representation and the words with the most similar usage patterns, showing that the two kinds of measure capture different aspects of language change. The relation between language change and AoA is stronger for neighborhood-level measures of language change: words with more coherent diachronic usage patterns tend to be acquired earlier. The results support theories positing a link between ontogenetic and ethnogenetic processes in language.
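The two families of measures described in the abstract (change in a word's own representation, and change in its relation to the words with the most similar usage) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the toy four-word vocabulary, the use of cosine distance for vector-level change, and the use of Jaccard overlap of the k nearest neighbors for neighborhood-level change. The abstract does not specify the authors' exact formulas, and the sketch assumes the embeddings for the two time periods are already aligned in a shared space.

```python
import numpy as np


def unit_rows(emb):
    """Scale every row of emb to unit length."""
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def vector_change(emb_t1, emb_t2):
    """Vector-level change: cosine distance between each word's
    (already aligned) vectors at two time points."""
    return 1.0 - np.sum(unit_rows(emb_t1) * unit_rows(emb_t2), axis=1)


def top_k_neighbors(emb, k):
    """Indices of the k most cosine-similar other words for each word."""
    unit = unit_rows(emb)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # a word is not its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]


def neighborhood_change(emb_t1, emb_t2, k):
    """Neighborhood-level change (illustrative choice): one minus the
    Jaccard overlap of each word's k nearest neighbors across periods."""
    n1 = top_k_neighbors(emb_t1, k)
    n2 = top_k_neighbors(emb_t2, k)
    change = np.empty(len(n1))
    for i, (a, b) in enumerate(zip(n1, n2)):
        set_a, set_b = set(a.tolist()), set(b.tolist())
        change[i] = 1.0 - len(set_a & set_b) / len(set_a | set_b)
    return change


# Hypothetical toy data: 4 words, 2 dimensions, two time periods.
# Only the last word's usage shifts between the periods.
emb_t1 = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
emb_t2 = emb_t1.copy()
emb_t2[3] = [1.0, 0.05]

vec_change = vector_change(emb_t1, emb_t2)
nbr_change = neighborhood_change(emb_t1, emb_t2, k=1)
```

In this toy case only the last word's vector moves, so it is the only word with non-zero vector-level change. Its nearest neighbor also changes, and so do the nearest neighbors of the words it moved toward or away from, which illustrates how the two kinds of measure can diverge. In a study like the one summarized here, per-word scores of this kind would then enter a regression against AoA alongside the listed control variables.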