Demberg Vera, Keller Frank
School of Informatics, University of Edinburgh, Edinburgh, UK.
Cognition. 2008 Nov;109(2):193-210. doi: 10.1016/j.cognition.2008.07.008. Epub 2008 Oct 18.
We evaluate the predictions of two theories of syntactic processing complexity, dependency locality theory (DLT) and surprisal, against the Dundee Corpus, which contains the eye-tracking record of 10 participants reading 51,000 words of newspaper text. Our results show that DLT integration cost is not a significant predictor of reading times for arbitrary words in the corpus. However, DLT successfully predicts reading times for nouns. We also find evidence for integration cost effects at auxiliaries, not predicted by DLT. For surprisal, we demonstrate that an unlexicalized formulation of surprisal can predict reading times for arbitrary words in the corpus. Comparing DLT integration cost and surprisal, we find that the two measures are uncorrelated, which suggests that a complete theory will need to incorporate both aspects of processing complexity. We conclude that eye-tracking corpora, which provide reading time data for naturally occurring, contextualized sentences, can complement experimental evidence as a basis for theories of processing complexity.
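The abstract does not state a formula for surprisal. As the term is commonly defined in the sentence-processing literature, the surprisal of a word is the negative log probability of that word given the preceding context, which can be computed as the difference between the log prefix probabilities before and after the word. The following is a minimal sketch under that assumption, not the authors' implementation; the function name and the toy prefix probabilities are hypothetical and chosen only for illustration.

```python
import math


def surprisal_from_prefix_probs(prefix_probs):
    """Return per-word surprisal in bits.

    prefix_probs[k] is assumed to be P(w_1..w_k), the probability of the
    first k words, with prefix_probs[0] = 1.0 for the empty prefix.
    """
    surprisals = []
    for prev, curr in zip(prefix_probs, prefix_probs[1:]):
        # P(w_k | w_1..w_{k-1}) = P(w_1..w_k) / P(w_1..w_{k-1}),
        # so surprisal = -log2 of that ratio = log2(prev) - log2(curr).
        surprisals.append(math.log2(prev) - math.log2(curr))
    return surprisals


# Hypothetical prefix probabilities for a three-word sentence.
toy_prefix_probs = [1.0, 0.5, 0.125, 0.0625]
word_surprisals = surprisal_from_prefix_probs(toy_prefix_probs)
print(word_surprisals)
```

In this toy input the second word is the least expected given its context, so it receives the highest surprisal. In a study like the one described, the prefix probabilities would come from a probabilistic parser or language model rather than a hand-written list.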