Fu Y X, Li W H
Center for Demographic and Population Genetics, University of Texas, Houston 77225.
Genetics. 1993 Aug;134(4):1261-70. doi: 10.1093/genetics/134.4.1261.
One of the most important parameters in population genetics is θ = 4Neμ, where Ne is the effective population size and μ is the rate of mutation per gene per generation. We study two related problems, using the maximum likelihood method and the theory of coalescence. One problem is the potential improvement in accuracy of estimating θ over existing methods; the other is the estimation of the parameter λ, the ratio of two θ values. The minimum variances of estimates of θ are derived under two idealized situations. These minimum variances serve as lower bounds on the variances of all possible estimates of θ in practice. We then show that Watterson's estimate of θ, based on the number of segregating sites, is asymptotically an optimal estimate of θ. However, for a finite sample of sequences, substantial improvement over Watterson's estimate is possible when θ is large. The maximum likelihood estimate of λ = θ1/θ2 is obtained and the properties of the estimate are discussed.
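For readers unfamiliar with Watterson's estimate mentioned in the abstract, the following is a minimal sketch of its standard textbook form, not code from the paper. It assumes the usual neutral infinite-sites model without recombination; the function names (`harmonic`, `watterson_theta`, `watterson_variance`) are hypothetical and chosen only for illustration.

```python
def harmonic(n, power=1):
    """Return sum_{i=1}^{n-1} 1 / i**power for a sample of n sequences."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i ** power for i in range(1, n))


def watterson_theta(segregating_sites, n):
    """Watterson's estimate: number of segregating sites S divided by a_n."""
    return segregating_sites / harmonic(n)


def watterson_variance(theta, n):
    """Variance of Watterson's estimate under the assumed model
    (neutral, infinite sites, no recombination):
    theta / a_n + b_n * theta**2 / a_n**2."""
    a_n = harmonic(n, 1)
    b_n = harmonic(n, 2)
    return theta / a_n + b_n * theta ** 2 / a_n ** 2


if __name__ == "__main__":
    # Illustrative values only: 10 segregating sites in a sample of 5 sequences.
    theta_hat = watterson_theta(10, 5)
    print(theta_hat, watterson_variance(theta_hat, 5))
```

Under these assumptions the variance contains a term proportional to the square of θ, which is one way to see why the size of θ matters when comparing estimates on a finite sample.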