Evaluating information-theoretic measures of word prediction in naturalistic sentence reading.

Affiliations

Department of Language Science and Technology, Saarland University, Saarbrücken, Germany.

Centre for Language Studies, Radboud University, Nijmegen, the Netherlands.

Publication information

Neuropsychologia. 2019 Nov;134:107198. doi: 10.1016/j.neuropsychologia.2019.107198. Epub 2019 Sep 22.

Abstract

We review information-theoretic measures of cognitive load during sentence processing that have been used to quantify word prediction effort. Two such measures, surprisal and next-word entropy, suffer from shortcomings when employed for a predictive processing view. We propose a novel metric, lookahead information gain, that can overcome these shortcomings. We estimate the different measures using probabilistic language models. Subsequently, we put them to the test by analysing how well the estimated measures predict human processing effort in three data sets of naturalistic sentence reading. Our results replicate the well-known effect of surprisal on word reading effort, but do not indicate a role of next-word entropy or lookahead information gain. Our computational results suggest that, in a predictive processing system, the costs of predicting may outweigh the gains. This idea poses a potential limit to the value of a predictive mechanism for the processing of language. The result illustrates the unresolved problem of finding estimations of word-by-word prediction that, first, are truly independent of perceptual processing of the to-be-predicted words, second, are statistically reliable predictors of experimental data, and third, can be derived from more general assumptions about the cognitive processes involved.
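For readers unfamiliar with the two established measures named in the abstract, the following is a minimal sketch of their standard textbook definitions: surprisal as the negative log probability of the word that actually occurs, and next-word entropy as the Shannon entropy of the distribution over possible upcoming words. The toy distribution `next_word` and the function names are hypothetical and chosen only for illustration; they are not taken from the paper, which estimates these quantities with probabilistic language models. Lookahead information gain is not sketched here because the abstract does not give its definition.

```python
import math
from typing import Mapping


def surprisal(prob: float) -> float:
    """Surprisal (in bits) of a word whose conditional probability is `prob`."""
    return -math.log2(prob)


def entropy(dist: Mapping[str, float]) -> float:
    """Shannon entropy (in bits) of a distribution over the next word."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical next-word distribution given some sentence context.
# A real study would obtain these probabilities from a language model.
next_word = {"dog": 0.5, "cat": 0.25, "bird": 0.25}

# Surprisal depends on which word is actually read.
surprisal_cat = surprisal(next_word["cat"])

# Entropy depends only on the distribution, not on the word that follows.
uncertainty = entropy(next_word)

print(f"surprisal('cat') = {surprisal_cat:.2f} bits")
print(f"next-word entropy = {uncertainty:.2f} bits")
```

The sketch makes one contrast from the abstract concrete: surprisal can only be computed once the word has been perceived, whereas entropy is available before the next word appears.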
