Department of Linguistics and Modern Languages, Chinese University of Hong Kong.
J Exp Psychol Gen. 2023 Aug;152(8):2359-2368. doi: 10.1037/xge0001409. Epub 2023 Jun 12.
Most words are low in frequency, yet a prevailing theory of word meaning (the distributional hypothesis: that words with similar meanings occur in similar contexts) and corresponding computational models struggle to represent low-frequency words. We conducted two preregistered experiments to test the hypothesis that similar-sounding words flesh out deficient semantic representations. In Experiment 1, native English speakers made semantic relatedness decisions about a cue followed either by a target that overlaps in form and meaning with a higher-frequency word or by a control matched on distributional and formal similarity to the cue. (Participants did not see the higher-frequency words.) As predicted, participants decided faster and more often that overlapping targets, compared to controls, were semantically related to cues. In Experiment 2, participants read sentences containing the same cues and targets. We used MouseView.js to blur the sentences and create a fovea-like aperture directed by the participant's cursor, allowing us to approximate fixation duration. While we did not observe the predicted difference at the target region, we found a lag effect, with shorter fixations on words following overlapping targets, suggesting easier integration of those meanings. These experiments provide evidence that words with overlapping forms and meanings bolster representations of low-frequency words, which supports approaches to natural language processing that incorporate both formal and distributional information and which revises assumptions about how an optimal language will evolve. (PsycInfo Database Record (c) 2023 APA, all rights reserved).