University at Buffalo, Buffalo, NY, USA.
University of Manitoba, Winnipeg, Manitoba, Canada.
Behav Res Methods. 2019 Dec;51(6):2438-2453. doi: 10.3758/s13428-019-01289-z.
We measured and documented corpus effects on lexical behavior. Specifically, we used a corpus of over 26,000 fiction books to show that computational models of language trained on samples of language (i.e., subcorpora) representative of the language of a particular place and time can track differences in people's behavior in language experiments. This conclusion held across multiple tasks (lexical decision, category production, and word familiarity) and provides insight into the influence that language experience has on language processing and organization. We also used the assembled corpus and methods to validate a new machine-learning approach for optimizing language models, called experiential optimization (Johns, Jones, & Mewhort, Psychonomic Bulletin & Review, 26, 103-126, 2019).