How do we use language? Shared patterns in the frequency of word use across 17 world languages.

Affiliation

School of Biological Sciences, University of Reading, Reading, UK.

Publication information

Philos Trans R Soc Lond B Biol Sci. 2011 Apr 12;366(1567):1101-7. doi: 10.1098/rstb.2010.0315.

PMID:21357232

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3049087/

Abstract

We present data from 17 languages on the frequency with which a common set of words is used in everyday language. The languages are drawn from six language families representing 65 per cent of the world's 7000 languages. Our data were collected from linguistic corpora that record frequencies of use for the 200 meanings in the widely used Swadesh fundamental vocabulary. Our interest is to assess evidence for shared patterns of language use around the world, and for the relationship of language use to rates of lexical replacement, defined as the replacement of a word by a new unrelated or non-cognate word. Frequencies of use for words in the Swadesh list range from just a few per million words of speech to 191 000 or more. The average inter-correlation among languages in the frequency of use across the 200 words is 0.73 (p < 0.0001). The first principal component of these data accounts for 70 per cent of the variance in frequency of use. Elsewhere, we have shown that frequently used words in the Indo-European languages tend to be more conserved, and that this relationship holds separately for different parts of speech. A regression model combining the principal factor loadings derived from the worldwide sample along with their part of speech predicts 46 per cent of the variance in the rates of lexical replacement in the Indo-European languages. This suggests that Indo-European lexical replacement rates might be broadly representative of worldwide rates of change. Evidence for this speculation comes from using the same factor loadings and part-of-speech categories to predict a word's position in a list of 110 words ranked from slowest to most rapidly evolving among 14 of the world's language families. This regression model accounts for 30 per cent of the variance. 
Our results point to a remarkable regularity in the way that human speakers use language, and hint that the words for a shared set of meanings have been slowly evolving and others more rapidly evolving throughout human history.
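The two summary statistics quoted in the abstract (the average inter-correlation among languages and the share of variance captured by the first principal component) can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so everything below is an assumption: the data are synthetic, the use of log-scale frequencies is a guess, and the PCA is run on the language-by-language correlation matrix. The printed numbers are not the paper's results.

```python
import numpy as np


def mean_intercorrelation(freq):
    """Mean pairwise Pearson correlation between columns (one column per language)."""
    corr = np.corrcoef(freq, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)  # each language pair counted once
    return corr[upper].mean()


def first_pc_share(freq):
    """Share of total variance on the first principal component of the
    language-by-language correlation matrix."""
    corr = np.corrcoef(freq, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)  # eigenvalues in ascending order
    return eigvals[-1] / eigvals.sum()


# Synthetic stand-in for a 200-meaning by 17-language table of log frequencies.
# A shared per-meaning signal plus independent noise; the noise scale is arbitrary.
rng = np.random.default_rng(0)
n_words, n_langs = 200, 17
shared = rng.normal(size=(n_words, 1))
noise = rng.normal(size=(n_words, n_langs))
log_freq = shared + 0.5 * noise

r_bar = mean_intercorrelation(log_freq)
share = first_pc_share(log_freq)
print(f"mean inter-correlation: {r_bar:.2f}")
print(f"variance on first PC:   {share:.2f}")
```

When the languages share a strong common signal, as in this toy table, both numbers come out high together: for a correlation matrix of n languages, the first component's share is at least (1 + (n - 1) * r_bar) / n.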
