Department of Biology, Emory University, Atlanta, Georgia, USA.
Genome Biol Evol. 2011;3:842-50. doi: 10.1093/gbe/evr044. Epub 2011 May 13.
Mutation rate variation has the potential to bias evolutionary inference, particularly when rates become much higher than the mean. We first confirm prior work that inferred the existence of cryptic, site-specific rate variation on the basis of coincident polymorphisms (sites that are segregating in both humans and chimpanzees). Then we extend this observation to a longer evolutionary timescale by identifying sites of coincident substitutions using four species. From these data, we develop analytic theory to infer the variance and skewness of the distribution of mutation rates. Even excluding CpG dinucleotides, we find a relatively large coefficient of variation and positive skew, which suggests that, although most sites in the genome have mutation rates near the mean, the distribution contains a long right-hand tail with a small number of sites having high mutation rates. At least for primates, these quickly mutating sites are few enough that the infinite sites model in population genetics remains appropriate.
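To illustrate why coincident polymorphisms carry information about rate variation, the following is a toy simulation, not the analytic theory developed in the paper. It assumes (hypothetically) that each site has a relative mutation rate r with mean 1, that the chance a site is segregating in each species is proportional to r, and that the two species are otherwise independent. Under those assumptions the ratio of observed to expected coincident sites is roughly E[r^2] = 1 + CV^2, where CV is the coefficient of variation of r. The gamma distribution, the function names, and all parameter values are illustrative choices only.

```python
import random


def gamma_rates(n, cv, rng):
    """Draw n relative mutation rates with mean 1 and coefficient of variation cv.

    A gamma distribution is an assumption made for illustration; it has
    positive skew (2 * cv), i.e. a long right-hand tail.
    """
    shape = 1.0 / cv ** 2
    return [rng.gammavariate(shape, 1.0 / shape) for _ in range(n)]


def excess_coincidence(rates, p1, p2, rng):
    """Return observed / expected counts of sites polymorphic in both species.

    p1 and p2 are the baseline per-site polymorphism probabilities in the two
    species (hypothetical values); each is scaled by the site's relative rate.
    The expectation assumes polymorphisms fall independently across sites.
    """
    n = len(rates)
    seg1 = seg2 = both = 0
    for r in rates:
        a = rng.random() < min(1.0, p1 * r)
        b = rng.random() < min(1.0, p2 * r)
        seg1 += a
        seg2 += b
        both += a and b
    expected = seg1 * seg2 / n
    return both / expected


if __name__ == "__main__":
    rng = random.Random(1)
    n_sites = 500_000
    uniform = [1.0] * n_sites
    variable = gamma_rates(n_sites, cv=1.0, rng=rng)
    print("uniform rates :", excess_coincidence(uniform, 0.02, 0.02, rng))
    print("variable rates:", excess_coincidence(variable, 0.02, 0.02, rng))
```

With uniform rates the ratio should hover near 1, while rate heterogeneity pushes it above 1. In this simplified picture, an excess of coincident polymorphisms is a signal of the variance in per-site mutation rates.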