Department of Computer Science, University of Massachusetts Boston, Boston, MA, USA.
Behav Res Methods. 2014 Mar;46(1):284-306. doi: 10.3758/s13428-013-0360-z.
The morphological constituents of English compounds (e.g., "butter" and "fly" for "butterfly") and two-character Chinese compounds may differ in meaning from the whole word. Subjective differences and the ambiguity of transparency make such judgments difficult, and a computational alternative based on a general model might be a way to average across subjective differences. In the present study, we propose two approaches based on latent semantic analysis (Landauer & Dumais, Psychological Review, 104, 211-240, 1997): Model 1 compares the semantic similarity between a compound word and each of its constituents, and Model 2 derives the dominant meaning of a constituent from a clustering analysis of its morphological family members (e.g., "butterfingers" or "buttermilk" for "butter"). The proposed models successfully predicted participants' transparency ratings. We recommend that experimenters use Model 1 for English compounds and Model 2 for Chinese compounds, on the basis of differences in raters' morphological processing in the two writing systems. The dominance of lexical meaning, semantic transparency, and the average similarity between all pairs within a morphological family are provided, and practical applications for future studies are discussed.
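As a rough illustration of the Model 1 idea described above, the sketch below scores a compound against each of its constituents with cosine similarity, and also computes the average similarity between all pairs within a morphological family. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `cosine`, `model1_transparency`, and `mean_family_similarity` are hypothetical, cosine is assumed as the similarity measure, and the three-dimensional vectors are invented toy values rather than coordinates from a trained latent semantic analysis space.

```python
from itertools import combinations

import numpy as np


def cosine(u, v):
    """Cosine similarity between two vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def model1_transparency(vectors, compound, constituents):
    """Similarity of the compound to each constituent (Model 1 idea).

    `vectors` maps a word to its semantic vector; here it is assumed
    to come from some LSA space built elsewhere.
    """
    return {c: cosine(vectors[compound], vectors[c]) for c in constituents}


def mean_family_similarity(vectors, family):
    """Average cosine similarity over all pairs of family members."""
    pairs = list(combinations(family, 2))
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Toy vectors, invented for illustration only (not real LSA output).
toy_vectors = {
    "butterfly": [0.1, 0.9, 0.2],
    "butter": [0.9, 0.1, 0.1],
    "fly": [0.2, 0.8, 0.4],
    "buttermilk": [0.8, 0.2, 0.1],
    "butterfingers": [0.3, 0.3, 0.9],
}

scores = model1_transparency(toy_vectors, "butterfly", ["butter", "fly"])
family_mean = mean_family_similarity(
    toy_vectors, ["butterfly", "buttermilk", "butterfingers"]
)
print(scores)
print(family_mean)
```

With these toy values, "butterfly" comes out closer to "fly" than to "butter", which is the kind of per-constituent contrast a transparency measure is meant to capture. Model 2 would additionally require clustering the family members' vectors, which is not shown here because the clustering procedure is not specified in this abstract.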