Massip Florian, Arndt Peter F
Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany.
Phys Rev Lett. 2013 Apr 5;110(14):148101. doi: 10.1103/PhysRevLett.110.148101. Epub 2013 Apr 2.
Recently, an enrichment of identical matching sequences has been found in many eukaryotic genomes. Their length distribution exhibits a power-law tail, raising the question of which evolutionary mechanisms or functional constraints could shape this distribution. Here we introduce a simple and evolutionarily neutral model, which involves only point mutations and segmental duplications and produces the same statistical features as observed in genomic data. Further, we extend a mathematical model for random stick breaking to show analytically that the exponent of the power-law tail is -3 and that it is universal, as it does not depend on the microscopic details of the model.
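The stick-breaking picture in the abstract can be illustrated with a small simulation. The sketch below is a toy version written for this note, not the authors' model or code. It assumes continuous sticks of one fixed length (standing in for freshly duplicated segments), a constant rate at which new sticks appear, and breaks (standing in for point mutations) that hit every stick at a constant rate per unit length and time. The function names and all parameter values are hypothetical. Under these assumptions the break points on a stick of a given age form a Poisson process along its length, so fragments can be drawn as exponential gaps.

```python
import math
import random


def fragment_lengths(n_sticks, stick_length, mu, t_max, rng):
    """Return fragment lengths of sticks born at uniform times in [0, t_max].

    Assumption: breaks occur at rate `mu` per unit length and unit time,
    so a stick of age `age` carries Poisson break points with density
    mu * age along its length.
    """
    lengths = []
    for _ in range(n_sticks):
        age = rng.uniform(0.0, t_max)
        rate = mu * age  # expected number of break points per unit length
        if rate <= 0.0:
            lengths.append(stick_length)  # brand-new stick, still intact
            continue
        start = 0.0
        while True:
            cut = start + rng.expovariate(rate)  # distance to next break
            if cut >= stick_length:
                lengths.append(stick_length - start)  # last fragment
                break
            lengths.append(cut - start)
            start = cut
    return lengths


def tail_exponent(lengths, l_min):
    """Estimate the power-law exponent from two adjacent doubling bins.

    For a density proportional to l**(-a), the counts in [l_min, 2*l_min)
    and [2*l_min, 4*l_min) have ratio 2**(a - 1), hence a = 1 + log2(ratio).
    The returned value is the exponent itself, that is -a.
    """
    low = sum(1 for x in lengths if l_min <= x < 2 * l_min)
    high = sum(1 for x in lengths if 2 * l_min <= x < 4 * l_min)
    return -(1.0 + math.log2(low / high))


if __name__ == "__main__":
    rng = random.Random(0)
    frags = fragment_lengths(20000, 1000.0, 1e-3, 200.0, rng)
    print("estimated tail exponent:", round(tail_exponent(frags, 50.0), 2))
```

The estimate is only meaningful for fragment lengths well below the stick length and well above `1 / (mu * t_max)`. Outside that window the finite stick size and the finite observation time distort the tail, which is why the example measures between 50 and 200 for sticks of length 1000.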