Agrawal Arpit, Agarwal Sumeet, Husain Samar
Indian Institute of Technology Delhi, India.
J Eye Mov Res. 2017 Apr 4;10(2). doi: 10.16910/jemr.10.2.4.
We used the Potsdam-Allahabad Hindi eye-tracking corpus to investigate the role of word-level and sentence-level factors during sentence comprehension in Hindi. Extending previous work that used this eye-tracking data, we investigate the role of surprisal and retrieval cost metrics during sentence processing. While controlling for word-level predictors (word complexity, syllable length, unigram and bigram frequencies) as well as sentence-level predictors such as integration and storage costs, we find a significant effect of surprisal on first-pass reading time (FPRT): higher surprisal values lead to an increase in FPRT. An effect of retrieval cost was found only for a higher degree of parser parallelism. Interestingly, while surprisal has a significant effect on FPRT, storage cost (another prediction-based metric) does not. A significant effect of storage cost shows up only in total fixation time (TFT), indicating that these two metrics perhaps capture different aspects of prediction. The study replicates previous findings that both prediction-based and memory-based metrics are required to account for processing patterns during sentence comprehension. The results also show that parser model assumptions are critical for drawing generalizations about the utility of a metric (e.g. surprisal) across various phenomena in a language.