Rogers Phillip G, Gries Stefan Th
Department of Linguistics, University of California, Santa Barbara, CA 93106, USA.
Department of English, Justus-Liebig University Giessen, 35390 Gießen, Germany.
Entropy (Basel). 2022 Apr 7;24(4):520. doi: 10.3390/e24040520.
Recent research into grammatical gender from the perspective of information theory has shown how seemingly arbitrary gender systems can ease processing demands by guiding lexical prediction. When the gender of a noun is revealed by a preceding element, the set of possible candidates is reduced to the nouns assigned to that gender. This strategy can be particularly effective if it eliminates words that are likely to compete for activation against the intended word. We propose syntax as the crucial context within which words must be disambiguated, hypothesizing that syntactically similar words should be less likely to share a gender cross-linguistically. We draw on recent work on syntactic information in the lexicon to define the syntactic distribution of a word as a probability vector of its participation in various dependency relations, and we extract such relations for 32 languages from the Universal Dependencies Treebanks. Correlational and mixed-effects regression analyses reveal that syntactically similar nouns are less likely to share a gender, the opposite of the pattern found for semantically and orthographically similar words. We interpret this finding as a design feature of language. This study adds to a growing body of research attesting to the different ways in which functional pressures on learning, memory, production, and perception shape the lexicon.
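The definition of a syntactic distribution given in the abstract (a probability vector over the dependency relations a word participates in) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy (noun, relation) pairs, and the use of cosine similarity to compare two distributions are hypothetical, since the abstract does not state which similarity measure the authors use. In practice the pairs would be read from the dependency-relation column of Universal Dependencies treebank files rather than typed by hand.

```python
from collections import Counter, defaultdict
import math


def syntactic_distributions(observations):
    """Turn (noun, relation) pairs into one probability vector per noun.

    Returns the sorted list of relation labels and a dict mapping each
    noun to its relative frequencies over those labels.
    """
    counts = defaultdict(Counter)
    for noun, relation in observations:
        counts[noun][relation] += 1

    # Fix one shared ordering of relations so vectors are comparable.
    relations = sorted({rel for c in counts.values() for rel in c})

    dists = {}
    for noun, c in counts.items():
        total = sum(c.values())
        dists[noun] = [c[rel] / total for rel in relations]
    return relations, dists


def cosine_similarity(p, q):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


# Hypothetical toy data: each pair is one occurrence of a noun in a relation.
toy = [
    ("dog", "nsubj"), ("dog", "nsubj"), ("dog", "obj"),
    ("cat", "nsubj"), ("cat", "obj"), ("cat", "obj"),
    ("table", "obl"), ("table", "obl"), ("table", "obj"),
]

relations, dists = syntactic_distributions(toy)
sim_dog_cat = cosine_similarity(dists["dog"], dists["cat"])
sim_dog_table = cosine_similarity(dists["dog"], dists["table"])
print(relations)
print(sim_dog_cat, sim_dog_table)
```

In this toy setting, "dog" and "cat" occur in more similar syntactic roles than "dog" and "table", so the hypothesis in the abstract would predict that the first pair is the less likely of the two to share a gender.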