Foster Peter G
Department of Zoology, The Natural History Museum, Cromwell Road, London SW7 5BD, United Kingdom.
Syst Biol. 2004 Jun;53(3):485-95. doi: 10.1080/10635150490445779.
Compositional heterogeneity among lineages can compromise phylogenetic analyses, because models in common use assume compositionally homogeneous data. Models that can accommodate compositional heterogeneity with few extra parameters are described here and applied to two examples where the true tree is known with confidence. Likelihood ratio tests show that adequate modeling of compositional heterogeneity can be achieved with few composition parameters, and that the data may not need to be modeled with separate composition parameters for each branch in the tree. Tree searching and placement of composition vectors on the tree are done in a Bayesian framework using Markov chain Monte Carlo (MCMC) methods. Fit of the model to the data is assessed in both maximum likelihood (ML) and Bayesian frameworks. In an ML framework, overall model fit is assessed using the Goldman-Cox test, and the fit of the composition implied by a (possibly heterogeneous) model to the composition of the data is assessed using a novel tree- and model-based composition fit test. In a Bayesian framework, overall model fit and composition fit are assessed using posterior predictive simulation. When compositional heterogeneity is not accommodated, the model does not fit and incorrect trees are found; when it is accommodated, the model fits and the known correct phylogenies are obtained.
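As a rough illustration of how composition fit might be quantified, the sketch below computes the conventional X² statistic for homogeneity of composition across sequences and a simulation-based tail-area probability. It is not the tree- and model-based test described in the abstract. The function names, the dictionary-of-strings alignment format, and the list of simulated statistics are all assumptions made for illustration; in practice the simulated values would come from data sets generated under the fitted model, a step not shown here.

```python
from collections import Counter


def composition_chi2(alignment, alphabet="ACGT"):
    """X^2 statistic for homogeneity of composition across sequences.

    `alignment` is a hypothetical format: a dict mapping taxon name to
    sequence string. Characters outside `alphabet` (gaps, ambiguity
    codes) are ignored.
    """
    counts = [
        Counter(ch for ch in seq.upper() if ch in alphabet)
        for seq in alignment.values()
    ]
    row_totals = [sum(c.values()) for c in counts]
    col_totals = {s: sum(c[s] for c in counts) for s in alphabet}
    grand_total = sum(row_totals)

    stat = 0.0
    for c, n_row in zip(counts, row_totals):
        for s in alphabet:
            # Expected count if every sequence shared the pooled composition.
            expected = n_row * col_totals[s] / grand_total
            if expected > 0:
                stat += (c[s] - expected) ** 2 / expected
    return stat


def tail_area_probability(observed, simulated):
    """Fraction of simulated statistics at least as large as the observed one."""
    return sum(1 for s in simulated if s >= observed) / len(simulated)


if __name__ == "__main__":
    # Toy alignment with a deliberately skewed third sequence.
    toy = {"t1": "ACGTACGT", "t2": "ACGTACGA", "t3": "GGGCGGCC"}
    print(composition_chi2(toy))
```

A small tail-area probability would suggest that the observed data are more heterogeneous in composition than data simulated under the model, which is the kind of misfit the abstract reports when composition is not accommodated.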