Johns Brendan T
Department of Communicative Disorders and Sciences, University at Buffalo, Buffalo, NY, United States.
Front Psychol. 2019 Feb 18;10:268. doi: 10.3389/fpsyg.2019.00268. eCollection 2019.
Big data approaches to psychology have become increasingly popular (Jones, 2017). Two of the main developments in this line of research are the advent of distributional models of semantics (e.g., Landauer and Dumais, 1997), which learn the meaning of words from large text corpora, and the collection of mega datasets of human behavior (e.g., the English Lexicon Project; Balota et al., 2007). The current article combines these two approaches, with the goal of understanding the consistency of, and preferences for, the word meanings that people hold. This was accomplished by mining a large amount of data from an online, crowdsourced dictionary and analyzing these data with a distributional model. Overall, it was found that even for words that are not an active part of the language environment, there is a large amount of consistency in the word meanings that different people have. Additionally, it was demonstrated that users of a language have strong preferences for word meanings, such that definitions of words that do not conform to people's conceptions are rejected by a community of language users. The results of this article provide insights into the cultural evolution of word meanings and shed light on alternative methodologies that can be used to understand lexical behavior.