Saitou N, Nei M
J Mol Evol. 1986;24(1-2):189-204. doi: 10.1007/BF02099966.
A mathematical theory for computing the probabilities of various nucleotide configurations among related species is developed, and the probability of obtaining the correct tree (topology) from nucleotide sequence data is evaluated using models of evolutionary trees that are close to the tree of mitochondrial DNAs from human, chimpanzee, gorilla, orangutan, and gibbon. Special attention is given to the number of nucleotides required to resolve the branching order among the three most closely related organisms (human, chimpanzee, and gorilla). If the extent of DNA divergence is close to that obtained by Brown et al. for mitochondrial DNA and if sequence data are available only for the three most closely related organisms, the number of nucleotides (m*) required to obtain the correct tree with a probability of 95% is about 4700. If sequence data for two outgroup species (orangutan and gibbon) are available, m* becomes about 2600-2700 when the transformed distance, distance-Wagner, maximum parsimony, or compatibility method is used. In the unweighted pair-group method, m* is not affected by the availability of data from outgroup species. When these five different tree-making methods, as well as Fitch and Margoliash's method, are applied to the mitochondrial DNA data (1834 bp) obtained by Brown et al. and by Hixson and Brown, they all give the same phylogenetic tree, in which human and chimpanzee are most closely related. However, the trees considered here are "gene trees," and to obtain the correct "species tree," sequence data for several independent loci must be used.
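The dependence of the probability of recovering the correct topology on the number of nucleotides can be illustrated with a small Monte Carlo sketch. This is not the authors' derivation. It assumes a Jukes-Cantor substitution model, a molecular clock, only three taxa (no outgroup), and a rule that joins the pair with the fewest observed differences, which for three taxa is what the unweighted pair-group method does. The branch lengths `a` and `b` and all function names are hypothetical choices for illustration, not the divergence estimates of Brown et al., so the numbers it prints should not be expected to reproduce the m* values quoted above.

```python
import math
import random
from collections import Counter
from itertools import product


def jc_prob(same: bool, d: float) -> float:
    """Jukes-Cantor probability of ending at the same base (same=True) or at
    one specific different base (same=False) after d substitutions per site."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e


def pattern_probs(a: float, b: float) -> dict:
    """Site-pattern probabilities for the true tree ((H,C),G).

    a: branch length from the H/C ancestor to H and to C (assumed value).
    b: length of the internal branch above the H/C ancestor (assumed value).
    Under a clock, G is a + 2b away from the H/C ancestor, so the tree is
    treated as a star centred on that ancestor (valid for a reversible model).
    """
    branch = {"H": a, "C": a, "G": a + 2.0 * b}
    probs = {"xxy": 0.0, "xyx": 0.0, "yxx": 0.0, "other": 0.0}
    center = 0  # all four bases are equivalent under Jukes-Cantor
    for h, c, g in product(range(4), repeat=3):
        p = (jc_prob(h == center, branch["H"])
             * jc_prob(c == center, branch["C"])
             * jc_prob(g == center, branch["G"]))
        if h == c != g:
            probs["xxy"] += p      # H = C, G differs: supports (H,C)
        elif h == g != c:
            probs["xyx"] += p      # H = G, C differs: supports (H,G)
        elif c == g != h:
            probs["yxx"] += p      # C = G, H differs: supports (C,G)
        else:
            probs["other"] += p    # all equal or all different: uninformative
    return probs


def prob_correct(m: int, a: float, b: float, reps: int = 300, seed: int = 0) -> float:
    """Estimate P(correct topology) for m sites by simulation.

    The H-C distance is strictly smallest exactly when the xxy count exceeds
    both the xyx and yxx counts. Ties are counted as failures (an assumption).
    """
    rng = random.Random(seed)
    probs = pattern_probs(a, b)
    keys = list(probs)
    weights = [probs[k] for k in keys]
    hits = 0
    for _ in range(reps):
        counts = Counter(rng.choices(keys, weights=weights, k=m))
        if counts["xxy"] > max(counts["xyx"], counts["yxx"]):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Hypothetical branch lengths in substitutions per site.
    for m in (250, 500, 1000, 2000):
        print(m, prob_correct(m, a=0.04, b=0.01, reps=200))
```

Scanning `m` upward until the estimate first reaches 0.95 gives a rough analogue of m* under these assumptions. The shorter the internal branch `b` is relative to `a`, the more sites are needed, which is the qualitative point of the abstract.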