Large-scale evidence of dependency length minimization in 37 languages.

Author information

Futrell Richard, Mahowald Kyle, Gibson Edward

Affiliations

Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139

Publication information

Proc Natl Acad Sci U S A. 2015 Aug 18;112(33):10336-41. doi: 10.1073/pnas.1502134112. Epub 2015 Aug 3.

Abstract

Explaining the variation between human languages and the constraints on that variation is a core goal of linguistics. In the last 20 years, it has been claimed that many striking universals of cross-linguistic variation follow from a hypothetical principle that dependency length (the distance between syntactically related words in a sentence) is minimized. Various models of human sentence production and comprehension predict that long dependencies are difficult or inefficient to process; minimizing dependency length thus enables effective communication without incurring processing difficulty. However, despite widespread application of this idea in theoretical, empirical, and practical work, there is not yet large-scale evidence that dependency length is actually minimized in real utterances across many languages; previous work has focused either on a small number of languages or on limited kinds of data about each language. Here, using parsed corpora of 37 diverse languages, we show that overall dependency lengths for all languages are shorter than conservative random baselines. The results strongly suggest that dependency length minimization is a universal quantitative property of human languages and support explanations of linguistic variation in terms of general properties of human information processing.
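The comparison described in the abstract can be illustrated with a minimal Python sketch. It computes the total dependency length of one parsed sentence and compares it with the average over random word orders. Everything here is an assumption made for illustration: the toy parse, the function names `total_dependency_length` and `random_baseline`, and above all the baseline itself, which is a plain uniform shuffle of word positions and is not claimed to be the conservative baseline used in the paper.

```python
import random


def total_dependency_length(heads, positions=None):
    """Sum |position(dependent) - position(head)| over all dependency arcs.

    heads[i] is the index of word i's head, or None for the root.
    positions[i] is the linear position of word i (defaults to i).
    """
    if positions is None:
        positions = list(range(len(heads)))
    return sum(
        abs(positions[dep] - positions[head])
        for dep, head in enumerate(heads)
        if head is not None
    )


def random_baseline(heads, n_samples=1000, seed=0):
    """Mean total dependency length when the same tree is linearized
    in uniformly random word orders (an illustrative baseline only)."""
    rng = random.Random(seed)
    positions = list(range(len(heads)))
    total = 0
    for _ in range(n_samples):
        rng.shuffle(positions)
        total += total_dependency_length(heads, positions)
    return total / n_samples


# Hypothetical toy parse of "John threw out the old trash":
# John -> threw, threw = root, out -> threw,
# the -> trash, old -> trash, trash -> threw
heads = [1, None, 1, 5, 5, 1]

observed = total_dependency_length(heads)  # 1 + 1 + 2 + 1 + 4 = 9
baseline = random_baseline(heads)

print(f"observed: {observed}, random baseline: {baseline:.2f}")
```

For this toy sentence the observed order yields a shorter total than the shuffled average, which is the direction of the effect the abstract reports across corpora. A corpus-level analysis would repeat the comparison over many parsed sentences and control for sentence length.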
