Curie-Cohen M
Genetics. 1982 Feb;100(2):339-58. doi: 10.1093/genetics/100.2.339.
The average inbreeding coefficient f of a population can be estimated in several different ways based solely on the genotypic frequencies at a single locus. The means and variances of four different estimates have been compared. While the four estimates are equivalent when there are two alleles, the best estimates when there are three or more alleles are based upon total heterozygosity (Formula: see text, where x and y are the expected and observed numbers of heterozygotes) and the proportion of alleles that are homozygous (Formula: see text, where k = the number of alleles, aii = the number of AiAi homozygotes, and 2aij = the number of AiAj heterozygotes). Both are minimally biased estimates of f and have identical sampling variances when all alleles are equally frequent. However, when alleles have different frequencies, the choice between these two estimates depends on the gene frequencies and the true inbreeding coefficient of the population: f2 is the best estimate when the true average inbreeding coefficient is suspected to be low or zero (f = 0), while f1 is best in populations with large average inbreeding coefficients. Approximate sampling variances of these two estimates are given for any f and any number of alleles with arbitrary gene frequencies; these approximations are accurate for samples as small as n = 100. The chi-square and maximum likelihood estimates of f perform less well at realistic sample sizes.
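The heterozygosity-based idea can be illustrated with a minimal sketch. The abstract's own formulas are elided ("see text"), so the code below is not the paper's estimator: it assumes the common textbook form f = 1 - (observed heterozygotes / expected heterozygotes), with the expected number taken from Hardy-Weinberg proportions at the sample allele frequencies and no small-sample bias correction. The function name heterozygosity_f and the input layout (a dict of genotype counts) are hypothetical choices made for illustration.

```python
from collections import Counter


def heterozygosity_f(genotype_counts):
    """Estimate f as 1 - (observed heterozygotes / expected heterozygotes).

    genotype_counts maps an (allele, allele) pair to the number of sampled
    individuals carrying that genotype, e.g. {("A", "B"): 40}.
    Assumption: expected heterozygotes come from Hardy-Weinberg proportions
    at the sample allele frequencies; no bias correction is applied.
    """
    n_individuals = sum(genotype_counts.values())
    if n_individuals == 0:
        raise ValueError("no individuals in sample")

    allele_counts = Counter()
    observed_het = 0
    for (allele_a, allele_b), count in genotype_counts.items():
        # Each individual contributes two allele copies.
        allele_counts[allele_a] += count
        allele_counts[allele_b] += count
        if allele_a != allele_b:
            observed_het += count

    total_alleles = 2 * n_individuals
    # Expected homozygosity under random mating is the sum of squared
    # allele frequencies; the remainder is expected heterozygosity.
    expected_hom = sum((c / total_alleles) ** 2 for c in allele_counts.values())
    expected_het = n_individuals * (1.0 - expected_hom)
    if expected_het == 0:
        raise ValueError("locus is monomorphic; f is undefined")

    return 1.0 - observed_het / expected_het


if __name__ == "__main__":
    # Two alleles at equal frequency with a deficit of heterozygotes.
    sample = {("A", "A"): 30, ("A", "B"): 40, ("B", "B"): 30}
    print(round(heterozygosity_f(sample), 3))
```

With the sample shown, 50 heterozygotes are expected and 40 are observed, so the estimate is about 0.2. A three-allele sample in exact Hardy-Weinberg proportions gives an estimate of about zero.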