Center for Demographic and Population Genetics, University of Texas Health Science Center at Houston, Houston, Texas 77025.
Genetics. 1978 Oct;90(2):349-82. doi: 10.1093/genetics/90.2.349.
Formulae are developed for the distribution of allele frequencies (the frequency spectrum), the mean number of alleles in a sample, and the mean and variance of heterozygosity under mutation pressure and under either genic or recessive selection. Numerical computations are carried out by using these formulae and Watterson's (1977) formula for the distribution of allele frequencies under overdominant selection. The following properties are observed: (1) The effect of selection on the distribution of allele frequencies is slight when 4Ns ≤ 4, but becomes strong when 4Ns becomes larger than 10, where N denotes the effective size and s the selective difference between alleles. Genic selection and recessive selection tend to force the distribution to be U-shaped, whereas overdominant selection has the opposite tendency. (2) The mean total number of alleles in a sample is much more strongly affected by selection than the mean number of rare alleles in a sample. (3) Even slight heterozygote advantage, as small as 10^-5, increases considerably the mean heterozygosity of a population, as compared to the case of neutral mutations. On the other hand, even slight genic or recessive selection causes a great reduction in heterozygosity when population size is large. (4) As a test statistic, the variance of heterozygosity can be used to detect the presence of selection, though it is not efficient when the selection intensity is very weak, say when 4Ns is around 4 or less. A model, which is somewhat similar to Ohta's (1976) model of slightly deleterious mutations, has been proposed to explain the following general patterns of genic variation: (i) There seems to be an upper limit for the observed average heterozygosities. (ii) The distribution of allele frequencies is U-shaped for every species surveyed. (iii) Most of the species surveyed tend to have an excess of rare alleles as compared with that expected under the neutral mutation hypothesis.
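The abstract compares its selection results against the neutral case but does not reproduce any formulae. As a point of reference only, the sketch below computes two standard neutral infinite-alleles expectations: the equilibrium heterozygosity θ/(1 + θ) and Ewens's expected number of alleles in a sample of n genes, with θ = 4Nv. These are textbook neutral results, not the selection formulae developed in the paper; the function names and the symbol v for the per-locus mutation rate are assumptions made for illustration.

```python
def neutral_heterozygosity(theta: float) -> float:
    """Expected heterozygosity at mutation-drift equilibrium under the
    neutral infinite-alleles model, with theta = 4 * N * v (assumed notation)."""
    return theta / (1.0 + theta)


def neutral_expected_alleles(theta: float, n: int) -> float:
    """Ewens's expected number of distinct alleles in a sample of n genes
    under neutrality: sum over i = 0..n-1 of theta / (theta + i)."""
    return sum(theta / (theta + i) for i in range(n))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    N = 10_000   # effective population size
    v = 1e-6     # mutation rate per locus per generation
    theta = 4 * N * v
    print("theta =", theta)
    print("neutral heterozygosity:", neutral_heterozygosity(theta))
    print("expected alleles in 100 genes:", neutral_expected_alleles(theta, 100))
```

Values obtained under selection, such as those the abstract describes for 4Ns larger than 10, would be read against baselines of this kind.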