Hey J
Department of Biological Sciences, Rutgers University, Nelson Laboratories, Piscataway, New Jersey 08855.
Genetics. 1991 Aug;128(4):831-40. doi: 10.1093/genetics/128.4.831.
When two samples of DNA sequences are compared, one way in which they may differ is in the presence of fixed differences, which are defined as sites at which all of the sequences in one sample are different from all of the sequences in the second sample. The probability distribution of the number of fixed differences is developed. The theory employs Wright-Fisher genealogies and the infinite sites mutation model. For the case when both samples are drawn randomly from the same population, it is found that genealogies permitting fixed differences are very unlikely. Thus the mere presence of fixed differences between samples is statistically significant, even for small samples. The theory is extended to samples from populations that have been separated for some time. The relationship between a simple Poisson distribution of mutations and the distribution of fixed differences is described as a function of the time since the populations were isolated. It is shown how these results may contribute to improved tests of recent balancing or directional selection.
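The definition of a fixed difference can be illustrated with a minimal sketch. This is not code from the paper: the function name `count_fixed_differences` and the toy sequences are hypothetical, and the sketch assumes the sequences are already aligned to equal length. It only counts fixed differences; it does not compute their probability distribution.

```python
def count_fixed_differences(sample_a, sample_b):
    """Count sites where every sequence in sample_a differs from
    every sequence in sample_b (hypothetical helper, aligned input assumed)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    length = len(sample_a[0])
    if any(len(seq) != length for seq in list(sample_a) + list(sample_b)):
        raise ValueError("sequences must be aligned to equal length")

    fixed = 0
    for site in range(length):
        # Collect the states observed at this site in each sample.
        states_a = {seq[site] for seq in sample_a}
        states_b = {seq[site] for seq in sample_b}
        # A fixed difference: no state is shared between the two samples.
        if states_a.isdisjoint(states_b):
            fixed += 1
    return fixed


# Toy data (invented for illustration only).
sample_a = ["ACGTA", "ACGTT", "ACGTA"]
sample_b = ["GCGCA", "GCGCA", "GCGCT"]
n_fixed = count_fixed_differences(sample_a, sample_b)
print(n_fixed)
```

In this toy alignment the first and fourth sites are fixed differences, while the last site is polymorphic within both samples and so does not count.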