The statistical trade-off between word order and word structure - Large-scale evidence for the principle of least effort.

Authors

Koplenig Alexander, Meyer Peter, Wolfer Sascha, Müller-Spitzer Carolin

Affiliation

Institute for the German Language (IDS), Mannheim, Germany.

Publication

PLoS One. 2017 Mar 10;12(3):e0173614. doi: 10.1371/journal.pone.0173614. eCollection 2017.

DOI:10.1371/journal.pone.0173614
PMID:28282435
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5345836/
Abstract

Languages employ different strategies to transmit structural and grammatical information. While, for example, grammatical dependency relationships in sentences are mainly conveyed by the ordering of the words for languages like Mandarin Chinese or Vietnamese, the word ordering is much less restricted for languages such as Inupiatun or Quechua, as these languages (also) use the internal structure of words (e.g. inflectional morphology) to mark grammatical relationships in a sentence. Based on a quantitative analysis of more than 1,500 unique translations of different books of the Bible in almost 1,200 different languages that are spoken as a native language by approximately 6 billion people (more than 80% of the world population), we present large-scale evidence for a statistical trade-off between the amount of information conveyed by the ordering of words and the amount of information conveyed by internal word structure: languages that rely more strongly on word order information tend to rely less on word structure information and vice versa. Or put differently, if less information is carried within the word, more information has to be spread among words in order to communicate successfully. In addition, we find that, despite differences in the way information is expressed, there is also evidence for a trade-off between different books of the biblical canon that recurs with little variation across languages: the more informative the word order of the book, the less informative its word structure and vice versa. We argue that this might suggest that, on the one hand, languages encode information in very different (but efficient) ways. On the other hand, content-related and stylistic features are statistically encoded in very similar ways.
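
The abstract does not spell out how the two quantities are estimated, so the following is only a rough sketch of the general idea, not the authors' procedure. It assumes that the information carried by word order (or by word structure) can be approximated by how much harder a text becomes to compress once that property is destroyed. The zlib compressor as an entropy proxy, the function names, and the masking scheme are all assumptions made for illustration.

```python
import random
import zlib


def bits_per_char(text: str) -> float:
    """Compressed size in bits per character (a crude proxy for entropy rate)."""
    if not text:
        raise ValueError("text must be non-empty")
    compressed = zlib.compress(text.encode("utf-8"), 9)
    return 8 * len(compressed) / len(text)


def shuffle_word_order(text: str, rng: random.Random) -> str:
    """Destroy word order: same word tokens, random sequence."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def mask_word_structure(text: str, rng: random.Random) -> str:
    """Destroy word-internal structure: each distinct word is replaced,
    consistently, by a random string of the same length.
    (Hypothetical masking scheme; two words may collide by chance.)"""
    alphabet = sorted(set("".join(text.split())))
    mapping: dict[str, str] = {}
    masked = []
    for word in text.split():
        if word not in mapping:
            mapping[word] = "".join(rng.choice(alphabet) for _ in word)
        masked.append(mapping[word])
    return " ".join(masked)


def information_estimates(text: str, seed: int = 0) -> dict[str, float]:
    """Extra bits per character needed once order or structure is destroyed."""
    rng = random.Random(seed)
    normalized = " ".join(text.split())  # same whitespace as the variants
    base = bits_per_char(normalized)
    return {
        "word_order": bits_per_char(shuffle_word_order(normalized, rng)) - base,
        "word_structure": bits_per_char(mask_word_structure(normalized, rng)) - base,
    }
```

Under these assumptions, a trade-off of the kind described above would show up as a negative relationship between the two values when they are computed for comparable texts in many languages. On short inputs the compressor overhead dominates, so the numbers are only meaningful for long texts.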

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/51f151afc2ee/pone.0173614.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/ebf842c826b2/pone.0173614.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/c9f95b15d61f/pone.0173614.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/dc0313edd172/pone.0173614.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/1f696943068c/pone.0173614.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/d81b87334f9e/pone.0173614.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e134/5345836/58d12cab1a0d/pone.0173614.g007.jpg