Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139.
Department of Computer Science, Institute for Machine Learning, ETH Zürich, Zürich 8092, Switzerland.
Proc Natl Acad Sci U S A. 2024 Mar 5;121(10):e2307876121. doi: 10.1073/pnas.2307876121. Epub 2024 Feb 29.
During real-time language comprehension, our minds rapidly decode complex meanings from sequences of words. The difficulty of doing so is known to be related to words' contextual predictability, but what cognitive processes do these predictability effects reflect? In one view, predictability effects reflect facilitation due to anticipatory processing of words that are predictable from context. This view predicts a linear effect of predictability on processing demand. In another view, predictability effects reflect the costs of probabilistic inference over sentence interpretations. This view predicts either a logarithmic or a superlogarithmic effect of predictability on processing demand, depending on whether it assumes pressures toward a uniform distribution of information over time. The empirical record is currently mixed. Here, we revisit this question at scale: We analyze six reading datasets, estimate next-word probabilities with diverse statistical language models, and model reading times using recent advances in nonlinear regression. Results support a logarithmic effect of word predictability on processing difficulty, which favors probabilistic inference as a key component of human language processing.
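The three candidate shapes for the predictability effect can be made concrete with a small sketch. This is an illustration only, not the regression models used in the study: the function names, the `slope` parameter, and the exponent `k` are hypothetical, and raising surprisal to a power `k > 1` is just one simple way to express a superlogarithmic effect.

```python
import math


def surprisal(p: float) -> float:
    """Surprisal of a word in bits, -log2(p), for a probability p in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def linear_cost(p: float, slope: float = 1.0) -> float:
    """Facilitation view (assumed form): cost falls linearly as probability rises."""
    return slope * (1.0 - p)


def log_cost(p: float, slope: float = 1.0) -> float:
    """Inference view (assumed form): cost is proportional to surprisal."""
    return slope * surprisal(p)


def superlog_cost(p: float, slope: float = 1.0, k: float = 1.5) -> float:
    """Inference plus uniform-information pressure (assumed form): surprisal ** k, k > 1."""
    return slope * surprisal(p) ** k


if __name__ == "__main__":
    # Halving the probability adds a constant amount of cost under the
    # logarithmic form, a shrinking amount under the linear form, and a
    # growing amount under the superlogarithmic form.
    for p in (0.5, 0.25, 0.125):
        print(p, linear_cost(p), log_cost(p), superlog_cost(p))
```

Under these assumed forms the hypotheses diverge most for low-probability words, which is where reading-time data would be expected to separate them.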