King Adam, Wedel Andrew
Department of Linguistics, University of Arizona.
Open Mind (Camb). 2020 Mar;4:1-12. doi: 10.1162/opmi_a_00030.
There has been much work over the last century on optimization of the lexicon for efficient communication, with a particular focus on the form of words as an evolving balance between production ease and communicative accuracy. Zipf's law of abbreviation, the cross-linguistic trend for less-probable words to be longer, represents some of the strongest evidence that the lexicon is shaped by a pressure for communicative efficiency. However, the various sounds that make up words do not all contribute the same amount of disambiguating information to a listener. Rather, the information a sound contributes depends in part on which specific lexical competitors exist in the lexicon. In addition, because the speech stream is perceived incrementally, early sounds in a word contribute on average more information than later sounds. Using a dataset of diverse languages, we demonstrate that, above and beyond containing more sounds, less-probable words contain sounds that convey more disambiguating information overall. We show further that this pattern tends to be strongest at word beginnings, where sounds can contribute the most information.
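The idea that a sound's information depends on the remaining lexical competitors, and that the speech stream is processed incrementally, can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon with made-up token counts, treats each letter as one segment, and scores each segment as the base-2 log of the ratio between the count-weighted cohort before and after hearing that segment. This is an illustration of the general notion only and is not necessarily the measure used in the paper.

```python
import math

# Hypothetical toy lexicon: word -> token count (invented for illustration).
toy_lexicon = {"cat": 8, "cab": 4, "can": 4, "dog": 16}


def segment_information(word, lexicon):
    """Return the information (in bits) contributed by each segment of `word`.

    Segment i is scored as log2(cohort / survivors), where `cohort` is the
    summed count of words consistent with the segments heard so far and
    `survivors` is the summed count of words still consistent after
    segment i is added. Letters stand in for sounds (an assumption).
    """
    infos = []
    for i in range(len(word)):
        prefix, longer = word[:i], word[:i + 1]
        cohort = sum(c for w, c in lexicon.items() if w.startswith(prefix))
        survivors = sum(c for w, c in lexicon.items() if w.startswith(longer))
        infos.append(math.log2(cohort / survivors))
    return infos


cat_bits = segment_information("cat", toy_lexicon)  # [1.0, 0.0, 1.0]
cab_bits = segment_information("cab", toy_lexicon)  # [1.0, 0.0, 2.0]
print(cat_bits, cab_bits)
```

In this toy lexicon the second segment of "cat" carries no information because every word beginning with "c" continues with "a", while the final segment of the rarer "cab" carries more information than the final segment of "cat", even though the two words have the same length.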