Institut de Biologie Computationnelle, LIRMM, UMR 5506 CNRS - Univ. Montpellier 2, Case courrier 06011, 95 rue de la Galéra, 34095 Montpellier, France; Allan Wilson Centre, University of Canterbury, Ilam Road 8041, Christchurch, New Zealand.
Syst Biol. 2014 May;63(3):421-35. doi: 10.1093/sysbio/syu010. Epub 2014 Feb 21.
Predicting the ancestral sequences of a group of homologous sequences related by a phylogenetic tree has been the subject of many studies, and numerous methods have been proposed for this purpose. Theoretical results show that when substitution rates become too large, reconstructing the ancestral state at the tree root is no longer feasible. Here, we also study the reconstruction of the ancestral changes that occurred along the tree edges. We show that, depending on the tree and branch length distribution, reconstructing these changes (i.e., reconstructing the ancestral state of all internal nodes in the tree) may be easier or harder than reconstructing the ancestral root state. However, results from information theory indicate that for the standard Yule tree, the task of reconstructing internal node states remains feasible, even for very high substitution rates. Moreover, computer simulations demonstrate that this result still holds for more complex trees and scenarios. For a large variety of counting-, parsimony- and likelihood-based methods, the predictive accuracy for a randomly selected internal node in the tree is indeed much higher than the accuracy of the same method when applied to the tree root. Finally, parsimony- and likelihood-based methods appear to be remarkably robust to sampling bias and model mis-specification.
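The comparison between root and internal-node accuracy can be explored with a small simulation. The Python sketch below is only an illustration under stated assumptions, not the protocol used in the paper: it assumes a two-state symmetric substitution model, grows a pure-birth (Yule) tree, and reconstructs ancestral states with Fitch parsimony. All function names (`yule_tree`, `simulate_states`, `fitch`, `compare_accuracy`) and default parameter values are hypothetical choices made for this example.

```python
import math
import random


def yule_tree(n_leaves, rng, birth_rate=1.0):
    """Grow a pure-birth (Yule) tree with n_leaves >= 2 tips.

    Returns (children, length): children[v] lists the two children of v
    (empty for a tip), length[v] is the length of the edge above v.
    Node 0 is the root, and every child has a larger id than its parent.
    """
    children, length, born = {0: []}, {0: 0.0}, {0: 0.0}
    active, t, next_id = [0], 0.0, 1
    while len(active) < n_leaves:
        t += rng.expovariate(birth_rate * len(active))  # time to next split
        parent = active.pop(rng.randrange(len(active)))
        length[parent] = t - born[parent]
        for _ in range(2):
            children[parent].append(next_id)
            children[next_id] = []
            born[next_id] = t
            active.append(next_id)
            next_id += 1
    t += rng.expovariate(birth_rate * len(active))  # let the tips grow
    for tip in active:
        length[tip] = t - born[tip]
    return children, length


def simulate_states(children, length, rate, rng):
    """Evolve one binary character down the tree (assumed symmetric model)."""
    states = {0: rng.randrange(2)}
    for node in sorted(children):  # parents are visited before children
        for child in children[node]:
            p_change = 0.5 * (1.0 - math.exp(-2.0 * rate * length[child]))
            states[child] = states[node] ^ (rng.random() < p_change)
    return states


def fitch(children, leaf_states, rng):
    """Fitch parsimony: return one most-parsimonious state per internal node.

    Only the entries of leaf_states that belong to tips are read.
    """
    sets = {}
    for node in sorted(children, reverse=True):  # children before parents
        kids = children[node]
        if not kids:
            sets[node] = {leaf_states[node]}
        else:
            a, b = sets[kids[0]], sets[kids[1]]
            sets[node] = (a & b) or (a | b)  # intersection, else union
    guess = {0: rng.choice(sorted(sets[0]))}  # ties at the root: random pick
    for node in sorted(children):
        for child in children[node]:
            if children[child]:  # internal nodes only
                if guess[node] in sets[child]:
                    guess[child] = guess[node]
                else:
                    guess[child] = rng.choice(sorted(sets[child]))
    return guess


def compare_accuracy(n_leaves=64, rate=1.0, n_sites=2000, seed=1):
    """Return (root accuracy, mean accuracy over all internal nodes)."""
    rng = random.Random(seed)
    children, length = yule_tree(n_leaves, rng)
    internal = [v for v in children if children[v]]
    root_hits = node_hits = 0
    for _ in range(n_sites):
        truth = simulate_states(children, length, rate, rng)
        guess = fitch(children, truth, rng)
        root_hits += guess[0] == truth[0]
        node_hits += sum(guess[v] == truth[v] for v in internal)
    return root_hits / n_sites, node_hits / (n_sites * len(internal))


if __name__ == "__main__":
    for rate in (0.1, 0.5, 1.0, 2.0):
        root_acc, node_acc = compare_accuracy(rate=rate)
        print(f"rate={rate}: root={root_acc:.3f} internal={node_acc:.3f}")
```

Averaging over all internal nodes stands in here for the accuracy at a randomly selected internal node. Varying `rate` and `n_leaves` gives a rough feel for how the two accuracies diverge as substitution rates grow; the exact numbers depend on the random tree and on the modelling assumptions above.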