Yang Z, Goldman N, Friday A
Department of Zoology, University of Cambridge, United Kingdom.
Mol Biol Evol. 1994 Mar;11(2):316-24. doi: 10.1093/oxfordjournals.molbev.a040112.
Using real sequence data, we evaluate the adequacy of assumptions made in evolutionary models of nucleotide substitution and the effects that these assumptions have on estimation of evolutionary trees. Two aspects of the assumptions are evaluated. The first concerns the pattern of nucleotide substitution, including equilibrium base frequencies and the transition/transversion-rate ratio. The second concerns the variation of substitution rates over sites. The maximum-likelihood estimate of tree topology appears quite robust to both these aspects of the assumptions of the models, but evaluation of the reliability of the estimated tree by using simpler, less realistic models can be misleading. Branch lengths are underestimated when simpler models of substitution are used, but the underestimation caused by ignoring rate variation over nucleotide sites is much more serious. The goodness of fit of a model is reduced by ignoring spatial rate variation, but unrealistic assumptions about the pattern of nucleotide substitution can lead to an extraordinary reduction in the likelihood. It seems that evolutionary biologists can obtain accurate estimates of certain evolutionary parameters even with an incorrect phylogeny, while systematists cannot get the right tree with confidence even when a realistic, and more complex, model of evolution is assumed.
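The claim that ignoring rate variation over sites leads to underestimated branch lengths can be illustrated with a small pairwise calculation. The sketch below is not the maximum-likelihood tree analysis used in the study. It assumes the simplest substitution model (Jukes-Cantor) and gamma-distributed rates with mean 1, and it uses the standard closed-form distance formulas for that case. The function names, the shape parameter `alpha = 0.5`, and the chosen true distances are illustrative assumptions, not values taken from the paper.

```python
import math


def expected_p_jc69_gamma(d, alpha):
    """Expected proportion of differing sites between two sequences separated
    by distance d (substitutions per site), under Jukes-Cantor with
    gamma-distributed rates over sites (mean 1, shape alpha)."""
    return 0.75 * (1.0 - (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha))


def jc69_distance(p):
    """Jukes-Cantor distance assuming one rate for all sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_gamma_distance(p, alpha):
    """Jukes-Cantor distance corrected for gamma rate variation over sites."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


def underestimation_table(true_distances, alpha):
    """For each true distance, return (true d, expected p,
    estimate ignoring rate variation, estimate accounting for it)."""
    rows = []
    for d in true_distances:
        p = expected_p_jc69_gamma(d, alpha)
        rows.append((d, p, jc69_distance(p), jc69_gamma_distance(p, alpha)))
    return rows


if __name__ == "__main__":
    # alpha = 0.5 is a hypothetical value representing strong rate variation.
    for d, p, d_naive, d_gamma in underestimation_table([0.1, 0.5, 1.0], alpha=0.5):
        print(f"true d={d:.2f}  p={p:.4f}  "
              f"equal-rates estimate={d_naive:.4f}  gamma estimate={d_gamma:.4f}")
```

In this toy setting the equal-rates estimate falls below the true distance, and the gap widens as the true distance grows, while the gamma-corrected formula recovers the true value from the expected proportion of differences.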