Takahata N, Nei M
Genetics. 1985 Jun;110(2):325-44. doi: 10.1093/genetics/110.2.325.
A mathematical theory is developed for computing the probability that m genes sampled from one population (species) and n genes sampled from another are derived from l genes that existed at the time of population splitting. The expected time of divergence between the two most closely related genes sampled from two different populations and the time of divergence (coalescence) of all genes sampled are studied by using this theory. It is shown that the time of divergence between the two most closely related genes can be used as an approximate estimate of the time of population splitting (T) only when T ≡ t/(2N) is small, where t and N are the number of generations and the effective population size, respectively. The variance of Nei and Li's estimate (d) of the number of net nucleotide differences between two populations is also studied. It is shown that the standard error (Sd) of d is larger than the mean when T is small (T ≪ 1). In this case, Sd is reduced considerably by increasing the sample size. When T is large (T > 1), however, a large proportion of the variance of d is caused by stochastic factors, and increasing the sample size does not help to reduce Sd. To reduce the stochastic variance of d, one must use data from many independent unlinked gene loci.
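The probability described in the first sentence can be illustrated with a minimal sketch. It is not the authors' derivation. It assumes the standard closed-form coalescent expression for the number of ancestral lineages of a sample, two descendant populations of equal constant size N with no migration after the split, and time measured as T = t/(2N). The function names (lineage_prob, ancestors_at_split) are hypothetical, and the alternating sum is only numerically reliable for modest sample sizes.

```python
from math import exp, factorial


def rising(a, k):
    """Rising factorial a(a+1)...(a+k-1); equals 1 when k == 0."""
    out = 1
    for i in range(k):
        out *= a + i
    return out


def falling(a, k):
    """Falling factorial a(a-1)...(a-k+1); equals 1 when k == 0."""
    out = 1
    for i in range(k):
        out *= a - i
    return out


def lineage_prob(m, l, T):
    """Probability that m sampled genes trace back to exactly l ancestral
    genes at time T in the past (T in units of 2N generations).
    Assumption: standard neutral coalescent in one population of constant size."""
    if not 1 <= l <= m:
        return 0.0
    total = 0.0
    for k in range(l, m + 1):
        sign = -1 if (k - l) % 2 else 1
        coef = (2 * k - 1) * rising(l, k - 1) * falling(m, k)
        coef /= factorial(l) * factorial(k - l) * rising(m, k)
        total += sign * coef * exp(-k * (k - 1) * T / 2)
    return total


def ancestors_at_split(m, n, T):
    """Distribution of the total number l of ancestral genes at the split,
    for m genes from one population and n from the other.
    Assumption: the two populations evolve independently after the split,
    so l = i + j with i and j drawn from their separate distributions."""
    dist = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            p = lineage_prob(m, i, T) * lineage_prob(n, j, T)
            dist[i + j] = dist.get(i + j, 0.0) + p
    return dist
```

For example, ancestors_at_split(5, 4, 0.3) returns a dictionary mapping each possible l (2 through 9) to its probability, and the values sum to 1 up to rounding. As T grows, the mass concentrates on l = 2, one ancestral gene per population.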