Yang Z
Department of Integrative Biology, University of California, Berkeley 94720-3140, USA.
Genetics. 1996 Dec;144(4):1941-50. doi: 10.1093/genetics/144.4.1941.
Statistical properties of a DNA sample from a random-mating population of constant size are studied under the finite-sites model. It is assumed that there is no migration and that no recombination occurs within the locus. A Markov process model is used for nucleotide substitution, allowing for multiple substitutions at a single site. The evolutionary rates among sites are treated as either constant or variable. The general likelihood calculation using numerical integration is computationally intensive and is feasible for three or four sequences only, but it may be used to validate approximate algorithms. Methods are developed to approximate the probability distribution of the number of segregating sites in a random sample of n sequences, with either constant or variable substitution rates across sites. Calculations using parameter estimates obtained for human D-loop mitochondrial DNAs show that among-site rate variation has a major effect on the distribution of the number of segregating sites; the distribution under the finite-sites model with variable rates among sites is quite different from that under the infinite-sites model.
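For orientation, the infinite-sites baseline that the abstract compares against can be sketched in a few lines. This is not the finite-sites approximation developed in the paper; it is the standard neutral coalescent result for a constant-size population without recombination. The function name `segregating_sites_pmf` and the parameters `theta` and `k_max` are assumptions made for illustration: `theta` is taken to be the population-scaled mutation rate for the whole locus, with time scaled so that each pair of lineages coalesces at rate 1 and each lineage mutates at rate `theta / 2`. Under those assumptions, the number of mutations that arise while j lineages remain is geometric with success probability (j - 1) / (j - 1 + theta), and the number of segregating sites is the sum of these independent counts for j = 2, ..., n.

```python
def segregating_sites_pmf(n, theta, k_max):
    """Return [P(S = 0), ..., P(S = k_max)] for a sample of n sequences.

    Infinite-sites model, constant population size, no recombination.
    theta is the (assumed) population-scaled mutation rate for the locus.
    """
    # Start with the distribution of an empty sum: S = 0 with probability 1.
    pmf = [1.0] + [0.0] * k_max
    for j in range(2, n + 1):
        # While j lineages remain, the next event is a coalescence with
        # probability p and a mutation otherwise, so the number of mutations
        # before the coalescence is geometric on 0, 1, 2, ...
        p = (j - 1) / (j - 1 + theta)
        geom = [p * (1.0 - p) ** m for m in range(k_max + 1)]
        # Convolve the running distribution with this geometric term.
        # Entries up to k_max are exact because they depend only on
        # smaller indices.
        new = [0.0] * (k_max + 1)
        for a, pa in enumerate(pmf):
            if pa == 0.0:
                continue
            for b in range(k_max + 1 - a):
                new[a + b] += pa * geom[b]
        pmf = new
    return pmf


# Illustrative values only; they are not estimates from the paper.
pmf = segregating_sites_pmf(n=10, theta=5.0, k_max=200)
mean_s = sum(k * p for k, p in enumerate(pmf))
watterson_mean = 5.0 * sum(1.0 / i for i in range(1, 10))
print(mean_s, watterson_mean)
```

The mean of the computed distribution should agree with the familiar expectation theta * (1 + 1/2 + ... + 1/(n - 1)), which gives a quick sanity check. A finite-sites calculation with among-site rate variation, as studied in the paper, would need a substitution model and a rate distribution on top of this and is not attempted here.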