Department of Psychology, Lancaster University, Lancaster, LA1 4YF, UK.
Institute of Linguistics, Academia Sinica, Taipei, Taiwan.
Behav Res Methods. 2018 Dec;50(6):2292-2304. doi: 10.3758/s13428-017-0993-4.
Words are considered semantically ambiguous if they have more than one meaning and can be used in multiple contexts. A number of recent studies have provided objective ambiguity measures by using a corpus-based approach and have demonstrated ambiguity advantages in both naming and lexical decision tasks. Although the predictive power of objective ambiguity measures has been examined in several alphabetic language systems, the effects in logographic languages remain unclear. Moreover, most ambiguity measures do not explicitly address how the various contexts associated with a given word relate to each other. To explore these issues, we computed the contextual diversity (Adelman, Brown, & Quesada, Psychological Science, 17, 814-823, 2006) and semantic diversity (Hoffman, Lambon Ralph, & Rogers, Behavior Research Methods, 45, 718-730, 2013) of traditional Chinese single-character words based on the Academia Sinica Balanced Corpus, where contextual diversity was used to evaluate the present semantic space. We then derived a novel ambiguity measure, namely semantic variability, by computing the distance properties of the distinct clusters grouped by the contexts that contained a given word. We demonstrated that semantic variability was superior to semantic diversity in accounting for the variance in naming response times, suggesting that considering the substructure of the various contexts associated with a given word can provide a relatively fine scale of ambiguity information for a word. All of the context and ambiguity measures for 2,418 Chinese single-character words are provided as supplementary materials.
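As a rough illustration of the two established measures named in the abstract, the sketch below computes contextual diversity as the number of contexts containing a word (following Adelman et al., 2006) and semantic diversity as the negative log of the mean pairwise cosine similarity between the vectors of those contexts (following Hoffman et al., 2013). It is a minimal sketch and not the authors' pipeline: the function names, the toy tokenized documents, and the two-dimensional context vectors are all assumptions made for illustration, and the semantic variability measure itself is not reproduced because the abstract does not specify its clustering or distance computation.

```python
import math
from itertools import combinations


def contextual_diversity(word, documents):
    """Count the documents (contexts) in which `word` occurs at least once."""
    return sum(1 for doc in documents if word in doc)


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_diversity(context_vectors):
    """Negative log of the mean pairwise cosine between context vectors.

    Assumes at least two vectors and a positive mean cosine.
    """
    pairs = list(combinations(context_vectors, 2))
    mean_cos = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    return -math.log(mean_cos)


# Hypothetical toy corpus: each document is a set of single-character tokens.
toy_documents = [{"花", "開"}, {"花", "錢"}, {"山", "水"}]

# Hypothetical context vectors (e.g. from a reduced semantic space).
similar_contexts = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0]]
varied_contexts = [[1.0, 0.1], [0.2, 1.0], [0.6, 0.6]]

cd = contextual_diversity("花", toy_documents)
semd_similar = semantic_diversity(similar_contexts)
semd_varied = semantic_diversity(varied_contexts)
```

Under these assumptions, a word whose contexts point in similar directions in the semantic space receives a lower semantic diversity than a word whose contexts are spread out, which is the property that the cluster-based semantic variability measure is described as refining.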