Heitmeier Maria, Chuang Yu-Ying, Axen Seth D, Baayen R Harald
Quantitative Linguistics, University of Tübingen, Tübingen, Germany.
Cluster of Excellence Machine Learning: New Perspectives for Science, University of Tübingen, Tübingen, Germany.
Front Hum Neurosci. 2024 Jan 8;17:1242720. doi: 10.3389/fnhum.2023.1242720. eCollection 2023.
Word frequency is a strong predictor in most lexical processing tasks. Thus, any model of word recognition needs to account for how word frequency effects arise. The Discriminative Lexicon Model (DLM) models lexical processing with mappings between words' forms and their meanings. Comprehension and production are modeled via linear mappings between the two domains. So far, the mappings within the model can either be obtained incrementally via error-driven learning, a computationally expensive process that captures frequency effects, or computed efficiently as a frequency-agnostic solution modeling the theoretical endstate of learning (EL), in which all words are learned optimally. In the present study we show how an efficient, yet frequency-informed mapping between form and meaning can be obtained (frequency-informed learning; FIL). We find that FIL closely approximates an incremental solution while being computationally much cheaper. FIL shows relatively low type accuracy but high token accuracy, demonstrating that the model correctly processes most word tokens that speakers encounter in daily life. We use FIL to model reaction times in the Dutch Lexicon Project by means of a Gaussian location-scale model and find that FIL predicts the S-shaped relationship between frequency and mean reaction time well, but underestimates the variance of reaction times for low-frequency words. FIL is also better able than EL to account for priming effects in an auditory lexical decision task in Mandarin Chinese. Finally, we use ordered data from CHILDES to compare mappings obtained with FIL and with incremental learning. We show that the mappings are highly correlated, but that with FIL some nuances based on word-ordering effects are lost. Our results show how frequency effects in a learning model can be simulated efficiently, and raise questions about how to best account for low-frequency words in cognitive models.
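The contrast between the two ways of obtaining a mapping can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration, not the paper's implementation: it takes incremental learning to be Widrow-Hoff (delta-rule) updates over a frequency-sampled token stream, and takes the frequency-informed closed-form mapping to be a frequency-weighted least-squares solution. All matrices, dimensions, and frequencies are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy lexicon: 5 word types with random form (cue) and meaning vectors.
n_words, n_form, n_sem = 5, 8, 6
C = rng.normal(size=(n_words, n_form))   # form matrix, one row per word
S = rng.normal(size=(n_words, n_sem))    # meaning matrix, one row per word
freq = np.array([100, 50, 20, 5, 1])     # hypothetical token frequencies

# Incremental learning: Widrow-Hoff updates over a shuffled token stream,
# so frequent words are trained on more often.
F_inc = np.zeros((n_form, n_sem))
lr = 0.01
stream = rng.permutation(np.repeat(np.arange(n_words), freq))
for i in stream:
    c, s = C[i:i + 1], S[i:i + 1]
    F_inc += lr * c.T @ (s - c @ F_inc)   # error-driven update

# Frequency-informed closed form: weighted least squares, with each word
# weighted by its relative token frequency (a small ridge term keeps the
# system well-conditioned).
W = np.diag(freq / freq.sum())
F_fil = np.linalg.solve(C.T @ W @ C + 1e-8 * np.eye(n_form), C.T @ W @ S)

# The two mappings should be highly correlated, as in the paper's finding.
r = np.corrcoef(F_inc.ravel(), F_fil.ravel())[0, 1]
print(f"correlation between incremental and closed-form mapping: {r:.2f}")
```

The closed form requires a single linear solve regardless of corpus size, whereas the incremental loop scales with the number of tokens, which is the efficiency gap the abstract refers to.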