Rogers A R, Jorde L B
Department of Anthropology, University of Utah, Salt Lake City, Utah 84112, USA.
Am J Hum Genet. 1996 May;58(5):1033-41.
Population geneticists work with a nonrandom sample of the human genome. Conventional practice ensures that unusually variable loci are most likely to be discovered and thus included in the sample of loci. Consequently, estimates of average heterozygosity are biased upward. In what follows we describe a model of this bias. When the mutation rate varies among loci, bias is increased. This effect is only moderate, however, so that a model of invariant mutation rates provides a reasonable approximation. Bias is pronounced when estimated heterozygosity is below approximately 35%. Consequently, it probably affects estimates from classical polymorphisms as well as from restriction-site polymorphisms. Estimates from short-tandem-repeat polymorphisms have negligible bias, because of their high heterozygosity. Bias should vary not only among categories of polymorphism but also among populations. It should be largest in European populations, since these are the populations in which most polymorphisms were discovered. As this argument predicts, European estimates exceed those of Africa and Asia at systems with large bias. The magnitude of this European excess is consistent with the version of our model in which mutation rates vary across loci.
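The direction of this bias can be illustrated with a small calculation. The sketch below is not the model described in the abstract; it is a simplified stand-in that assumes biallelic loci, and assumes a locus is "discovered" only when a discovery panel of n chromosomes happens to show both alleles. The allele frequencies, panel sizes, and function names are all hypothetical and chosen only for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity of a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def discovery_prob(p, n):
    """Probability that a panel of n chromosomes contains both alleles
    (assumed discovery rule, not taken from the paper)."""
    return 1.0 - p ** n - (1.0 - p) ** n


def mean_heterozygosity(freqs, n=None):
    """Average heterozygosity over loci.

    With n=None every locus counts equally (the unbiased average).
    With an integer n each locus is weighted by its discovery probability,
    which mimics averaging only over loci that were ascertained.
    """
    if n is None:
        weights = [1.0] * len(freqs)
    else:
        weights = [discovery_prob(p, n) for p in freqs]
    total = sum(weights)
    return sum(w * heterozygosity(p) for w, p in zip(weights, freqs)) / total


def bias(freqs, n):
    """Upward bias: ascertained average minus unbiased average."""
    return mean_heterozygosity(freqs, n) - mean_heterozygosity(freqs)


# Hypothetical allele frequencies: one set dominated by low-heterozygosity
# loci, one set of uniformly high-heterozygosity loci.
low_het_loci = [0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5]
high_het_loci = [0.4, 0.45, 0.5]

if __name__ == "__main__":
    for n in (4, 20, 200):
        print(f"panel n={n:3d}  "
              f"bias (low-het loci) = {bias(low_het_loci, n):.4f}  "
              f"bias (high-het loci) = {bias(high_het_loci, n):.4f}")
```

Under these assumptions the ascertained average always exceeds the unbiased one, the excess shrinks as the discovery panel grows, and it is much smaller for the high-heterozygosity set. That matches the qualitative claims above (large bias at low heterozygosity, negligible bias for highly heterozygous systems), but the numbers carry no empirical meaning.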