Computational Linguistics, Saarland University; School of Informatics, University of Edinburgh.
Cogn Sci. 2009 Jul;33(5):794-838. doi: 10.1111/j.1551-6709.2009.01033.x. Epub 2009 Apr 17.
Experimental research shows that human sentence processing uses information from different levels of linguistic analysis, for example, lexical and syntactic preferences as well as semantic plausibility. Existing computational models of human sentence processing, however, have focused primarily on lexico-syntactic factors. Those models that do account for semantic plausibility effects lack a general model of human plausibility intuitions at the sentence level. Within a probabilistic framework, we propose a wide-coverage model that both assigns thematic roles to verb-argument pairs and determines a preferred interpretation by evaluating the plausibility of the resulting (verb, role, argument) triples. The model is trained on a corpus of role-annotated language data. We also present a transparent integration of the semantic model with an incremental probabilistic parser. We demonstrate that both the semantic plausibility model and the combined syntax/semantics model predict judgment and reading time data from the experimental literature.
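As a rough illustration of the idea of scoring (verb, role, argument) triples and choosing a preferred interpretation, the following minimal sketch estimates plausibility as a smoothed relative frequency over a tiny role-annotated corpus. Everything in it is an assumption made for illustration: the toy triples, the add-alpha smoothing, and the function names `plausibility` and `preferred_role` are hypothetical and are not taken from the paper, whose actual estimator and training corpus are not described in this abstract.

```python
from collections import Counter

# Hypothetical toy corpus of role-annotated (verb, role, argument) triples.
# A real model would be trained on a large role-annotated corpus.
TRIPLES = [
    ("arrest", "agent", "police"),
    ("arrest", "agent", "police"),
    ("arrest", "patient", "thief"),
    ("arrest", "patient", "thief"),
    ("arrest", "patient", "police"),
    ("eat", "agent", "child"),
    ("eat", "patient", "apple"),
]

VERBS = sorted({v for v, _, _ in TRIPLES})
ROLES = sorted({r for _, r, _ in TRIPLES})
ARGS = sorted({a for _, _, a in TRIPLES})

counts = Counter(TRIPLES)
total = sum(counts.values())


def plausibility(verb, role, arg, alpha=0.1):
    """Add-alpha smoothed estimate of P(verb, role, arg).

    The smoothing scheme is an assumption of this sketch, chosen so that
    unseen triples receive a small non-zero score.
    """
    n_possible = len(VERBS) * len(ROLES) * len(ARGS)
    return (counts[(verb, role, arg)] + alpha) / (total + alpha * n_possible)


def preferred_role(verb, arg):
    """Return the role that makes the (verb, role, arg) triple most plausible."""
    return max(ROLES, key=lambda role: plausibility(verb, role, arg))


if __name__ == "__main__":
    for verb, arg in [("arrest", "thief"), ("arrest", "police")]:
        role = preferred_role(verb, arg)
        print(verb, arg, role, round(plausibility(verb, role, arg), 3))
```

In this toy setting, role assignment and plausibility scoring fall out of the same probability estimate: the preferred role is simply the one whose triple scores highest, which mirrors, in a much reduced form, the two tasks the abstract attributes to the semantic model.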