Department of Physics and State Key Laboratory of Surface Physics, Fudan University, Shanghai, China.
PLoS One. 2013 Sep 2;8(9):e74515. doi: 10.1371/journal.pone.0074515. eCollection 2013.
Rankings are ubiquitous around the world. Here I investigate spatial ranking patterns of English words and Chinese characters, and reveal a common construction pattern related to phase separation. In detail, I analyze a list of distinct words in the English language, and find that the frequency of the number of letters per word decays either linearly or nonlinearly with its rank in the frequency table. I interpret the linearly decaying region as a linear phase that covers 96.4% of the words, in sharp contrast to a nonlinear phase (the nonlinearly decaying region) that covers the remaining 3.6%. Remarkably, the same phase separation, with the same two percentages of 96.4% and 3.6%, also holds for the relation between strokes and characters in the Chinese language, although English and Chinese are two distinctly different language systems. The common construction pattern originates from the log-normal distributions of the frequencies of words or characters, which can be understood as the joint effect of the Weber-Fechner law in psychophysics and the principle of maximum entropy in information theory.
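The last step of the argument can be illustrated with a short derivation. This is only a sketch under stated assumptions, not necessarily the derivation used in the paper: the symbols x (a frequency), s (its perceived magnitude), the constants k and x_0, and the constraint of fixed mean \mu_s and variance \sigma_s^2 of s are all introduced here for illustration.

```latex
% Weber-Fechner law (assumed form): perceived magnitude s grows
% logarithmically with the stimulus x.
s = k \ln\frac{x}{x_0}

% Maximum entropy (assumed constraints): among all densities of s with
% fixed mean \mu_s and variance \sigma_s^2, the entropy is maximized by
% a Gaussian.
p(s) = \frac{1}{\sigma_s\sqrt{2\pi}}
       \exp\!\left[-\frac{(s-\mu_s)^2}{2\sigma_s^2}\right]

% Change of variables, using ds/dx = k/x, gives a log-normal density for x.
p(x) = p(s)\left|\frac{ds}{dx}\right|
     = \frac{1}{x\,\sigma\sqrt{2\pi}}
       \exp\!\left[-\frac{(\ln x-\mu)^2}{2\sigma^2}\right],
\qquad
\mu = \ln x_0 + \frac{\mu_s}{k},
\quad
\sigma = \frac{\sigma_s}{k}
```

Under these assumptions, a quantity that is perceived logarithmically and is otherwise maximally unconstrained follows a log-normal distribution, which is the form the abstract attributes to the frequencies of words and characters.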