Reeves J H
Statistics Department, University of Georgia, Athens 30602.
J Mol Evol. 1992 Jul;35(1):17-31. doi: 10.1007/BF00160257.
Several forms of maximum likelihood models are applied to aligned amino acid sequence data coded for in the mitochondrial DNA of six species (chicken, frog, human, bovine, mouse, and rat). These models range in form from relatively simple models of the type currently used for inferring phylogenetic tree structure to models more complex than those that have been used previously. No major discrepancies between the optimal trees inferred by any of these methods are found, but there are huge differences in adequacy of fit. A very significant finding is that the fit of any of these models is vastly improved by allowing a certain proportion of the amino acid sites to be invariant. An even more important, although disquieting, finding is that none of these models fits well, as judged by standard statistical criteria. The primary reason for this is that amino acid sites undergo substitution according to a process that is very heterogeneous. Because most phylogenetic inference is accomplished by choosing the optimal tree under the assumption that a homogeneous process is acting on the sites, the potential invalidity of some such conclusions is raised by this article's results. The seriousness of this problem depends upon the robustness of the phylogenetic inferential procedure to departures from the underlying model.
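The invariant-sites idea described above can be illustrated with a minimal sketch. It assumes the common mixture formulation in which each site is invariant with probability `p_inv` and otherwise evolves under the substitution model on the tree. This is not the article's implementation: the function names, the grid search, and the toy numbers are all hypothetical, and the per-site likelihoods under the variable process are held fixed, whereas a real analysis would re-optimize the tree parameters jointly with `p_inv`.

```python
import math

def mixture_log_likelihood(p_inv, site_liks, const_freqs):
    """Log-likelihood of an alignment under an invariant-sites mixture.

    site_liks[i]   : likelihood of site i under the variable process (assumed > 0)
    const_freqs[i] : equilibrium frequency of the shared residue if site i is
                     identical in all species, else 0.0
    """
    total = 0.0
    for lik, freq in zip(site_liks, const_freqs):
        # A site is invariant with probability p_inv, variable otherwise.
        total += math.log(p_inv * freq + (1.0 - p_inv) * lik)
    return total

def best_p_inv(site_liks, const_freqs, steps=1000):
    """Grid search for the p_inv in [0, 1) that maximizes the log-likelihood."""
    grid = [k / steps for k in range(steps)]
    return max(grid, key=lambda p: mixture_log_likelihood(p, site_liks, const_freqs))

# Hypothetical toy data: two constant sites followed by two variable sites.
toy_site_liks = [1e-3, 2e-3, 1e-6, 5e-7]
toy_const_freqs = [0.05, 0.08, 0.0, 0.0]

p_hat = best_p_inv(toy_site_liks, toy_const_freqs)
gain = (mixture_log_likelihood(p_hat, toy_site_liks, toy_const_freqs)
        - mixture_log_likelihood(0.0, toy_site_liks, toy_const_freqs))
print(p_hat, gain)
```

Setting `p_inv = 0` recovers the homogeneous model, so `gain` is the improvement in log-likelihood from allowing invariant sites. Because the two models are nested, twice this gain is the usual likelihood-ratio statistic for comparing them.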