Kruglyak S, Durrett R T, Schug M D, Aquadro C F
School of Operations Research and Industrial Engineering, Rhodes Hall, Biotechnology Building, Cornell University, Ithaca, NY 14853, USA.
Proc Natl Acad Sci U S A. 1998 Sep 1;95(18):10774-8. doi: 10.1073/pnas.95.18.10774.
We describe and test a Markov chain model of microsatellite evolution that can explain the different distributions of microsatellite lengths across different organisms and repeat motifs. Two key features of this model are the dependence of mutation rates on microsatellite length and a mutation process that includes both strand slippage and point mutation events. We compute the stationary distribution of allele lengths under this model and use it to fit DNA data for di-, tri-, and tetranucleotide repeats in humans, mice, fruit flies, and yeast. The best-fit results lead to slippage rate estimates that are highest in mice, followed by humans, then yeast, and then fruit flies. Within each organism, the estimates are highest in di-, then tri-, and then tetranucleotide repeats. Our estimates are consistent with experimentally determined mutation rates from other studies. The results suggest that the different length distributions among organisms and repeat motifs can be explained by a simple difference in slippage rates and that selective constraints on length need not be imposed.
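The abstract does not give the transition rates, so the following is only a minimal sketch of how a stationary distribution could be computed for a chain of this general kind, not the authors' model. Every specific here is an assumption made for illustration: the names build_generator and stationary_distribution, the truncation at n_max repeats, a slippage rate of slip * (k - 1) in each direction, point mutations at rate point * k that leave a length drawn uniformly from 1 to k - 1, and a rate of point for a single unit to become two.

```python
def build_generator(n_max, slip, point):
    """Rate matrix Q on repeat counts 1..n_max (all rates are assumptions)."""
    q = [[0.0] * n_max for _ in range(n_max)]
    for k in range(1, n_max + 1):
        i = k - 1  # row index for repeat count k
        if k == 1:
            # Assumption: a point mutation can create a second repeat unit.
            if n_max > 1:
                q[i][i + 1] += point
        else:
            rate = slip * (k - 1)  # length-dependent slippage
            if k < n_max:
                q[i][i + 1] += rate  # slippage gains one repeat
            q[i][i - 1] += rate      # slippage loses one repeat
            # Assumption: point mutations occur at rate point * k and leave
            # a length chosen uniformly from 1..k-1.
            for j in range(1, k):
                q[i][j - 1] += point * k / (k - 1)
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


def stationary_distribution(q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(q)
    # Rows of the augmented matrix are the balance equations (Q transposed).
    a = [[q[col][row] for col in range(n)] + [0.0] for row in range(n)]
    # Replace the last (redundant) balance equation with the normalization.
    a[n - 1] = [1.0] * n + [1.0]
    for c in range(n):
        # Partial pivoting: pick the row with the largest entry in column c.
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(n):
            if r != c:
                f = a[r][c] / a[c][c]
                if f != 0.0:
                    a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [a[r][n] / a[r][r] for r in range(n)]


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    pi = stationary_distribution(build_generator(30, slip=1.0, point=0.01))
    for length, prob in enumerate(pi[:5], start=1):
        print(length, prob)
```

Fitting in the sense of the abstract would then mean choosing the rate parameters so that the computed distribution matches observed length counts; that step is not sketched here because the abstract does not describe the fitting procedure.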