Richard Futrell, Kyle Mahowald, Edward Gibson
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
Proc Natl Acad Sci U S A. 2015 Aug 18;112(33):10336-41. doi: 10.1073/pnas.1502134112. Epub 2015 Aug 3.
Explaining the variation between human languages and the constraints on that variation is a core goal of linguistics. In the last 20 years, it has been claimed that many striking universals of cross-linguistic variation follow from a hypothetical principle that dependency length (the distance between syntactically related words in a sentence) is minimized. Various models of human sentence production and comprehension predict that long dependencies are difficult or inefficient to process; minimizing dependency length thus enables effective communication without incurring processing difficulty. However, despite widespread application of this idea in theoretical, empirical, and practical work, there is not yet large-scale evidence that dependency length is actually minimized in real utterances across many languages; previous work has focused either on a small number of languages or on limited kinds of data about each language. Here, using parsed corpora of 37 diverse languages, we show that overall dependency lengths for all languages are shorter than conservative random baselines. The results strongly suggest that dependency length minimization is a universal quantitative property of human languages and support explanations of linguistic variation in terms of general properties of human information processing.
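To make the quantity in the abstract concrete, the following is a minimal Python sketch of how total dependency length might be computed for one parsed sentence and compared with a random-order baseline. It is illustrative only: the function names, the head-index encoding (1-based head positions, 0 for the root), and the example parse are assumptions, and the baseline here is a uniformly random reordering of the words, which is simpler than, and not necessarily the same as, the conservative baselines the abstract refers to.

```python
import random


def total_dependency_length(heads):
    """Sum of head-dependent distances for one sentence.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no incoming dependency.
    (This encoding is an assumption made for the sketch.)
    """
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)


def random_order_baseline(heads, n_samples=1000, seed=0):
    """Mean total dependency length when the same tree is linearized
    in uniformly random word orders (an assumed, simplified baseline)."""
    rng = random.Random(seed)
    n = len(heads)
    totals = []
    for _ in range(n_samples):
        # positions[i] is the new 1-based position of original word i + 1
        positions = list(range(1, n + 1))
        rng.shuffle(positions)
        total = sum(
            abs(positions[i] - positions[h - 1])
            for i, h in enumerate(heads)
            if h != 0
        )
        totals.append(total)
    return sum(totals) / len(totals)


# Hypothetical parse of "The dog chased the cat":
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
heads = [2, 3, 0, 5, 3]
observed = total_dependency_length(heads)  # 1 + 1 + 1 + 2 = 5
baseline = random_order_baseline(heads)
print(observed, observed < baseline)
```

Under these assumptions, a sentence whose observed total falls below the baseline mean is ordered more compactly than a random arrangement of the same tree. A corpus-level comparison would aggregate such values over many sentences, typically as a function of sentence length.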