Jacobs Arthur M, Kinder Annette
Experimental and Neurocognitive Psychology Group, Department of Educational Science and Psychology, Freie Universität Berlin, Berlin, Germany.
Center for Cognitive Neuroscience Berlin (CCNB), Freie Universität Berlin, Berlin, Germany.
Front Artif Intell. 2022 Feb 22;5:718690. doi: 10.3389/frai.2022.718690. eCollection 2022.
Recent progress in machine-learning-based distributed semantic models (DSMs) offers new ways to simulate the apperceptive mass (AM; Kintsch, 1980) of reader groups or individual readers and to predict their performance in reading-related tasks. The AM integrates the mental lexicon with world knowledge, as acquired, for example, by reading books. Following pioneering work by Denhière and Lemaire (2004), we computed DSMs based on a representative corpus of German children's and youth literature (Jacobs et al., 2020) as null models of the part of the AM that represents distributional semantic input, for readers of different reading ages (grades 1-2, 3-4, and 5-6). After a series of DSM quality tests, we evaluated the performance of these models quantitatively in various tasks to simulate the different reader groups' hypothetical semantic and syntactic skills. In a final study, we compared the models' performance with that of human adult and child readers in two rating tasks. Overall, the results show that performance in practically all tasks improves with increasing reading age. The approach taken in these studies reveals both the limits of DSMs for simulating human AM and their potential for applications in scientific studies of literature, research in education, or developmental science.