Ranwez Vincent, Gascuel Olivier
Département Informatique Fondamentale et Applications, LIRMM, Montpellier Cedex 5, France.
Mol Biol Evol. 2002 Nov;19(11):1952-63. doi: 10.1093/oxfordjournals.molbev.a004019.
We introduce a new approach to estimating the evolutionary distance between two sequences. The approach uses a tree with three leaves: two correspond to the sequences under study, and the third is chosen to handle long-distance estimation. The branch lengths of this tree are obtained by likelihood maximization and are then used to deduce the desired distance. This approach, called TripleML, improves the precision of evolutionary distance estimates and thus the topological accuracy of distance-based methods. TripleML can be used with neighbor-joining-like (NJ-like) methods not only to compute the initial distance matrix but also to estimate the new distances that arise during the agglomeration process. Computer simulations indicate that TripleML significantly improves the topological accuracy of NJ, BioNJ, and Weighbor while keeping computation time reasonable. With randomly generated 24-taxon trees and realistic parameter values, combining NJ with TripleML reduces the number of wrongly inferred branches by about 11% (compared with 2.6% and 5.5% for BioNJ and Weighbor, respectively). Moreover, this combination requires only about 1.5 min to infer a phylogeny of 96 sequences of 1,200 nucleotides each, compared with 6.5 h for FastDNAml on the same machine (a 466 MHz PC).
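The three-leaf idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions that the abstract does not specify: the Jukes-Cantor (JC69) substitution model, gap-free aligned sequences over `ACGT`, a third sequence supplied by the caller, maximization by coordinate-wise golden-section search, and a distance taken as the sum of the two branch lengths leading to the studied sequences. The names `triple_distance`, `log_likelihood`, `golden_max`, and `jc_prob` are hypothetical.

```python
import math
from collections import Counter

STATES = "ACGT"  # assumption: aligned DNA with no gaps or ambiguity codes


def jc_prob(same, t):
    """JC69 probability of ending in the same or in a given different base after length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e


def log_likelihood(patterns, lengths):
    """Log-likelihood of a three-leaf star tree; lengths = [t_a, t_b, t_ref]."""
    total = 0.0
    for (a, b, r), count in patterns.items():
        # Sum over the unobserved state x at the internal node (uniform prior 1/4).
        site = sum(
            0.25
            * jc_prob(x == a, lengths[0])
            * jc_prob(x == b, lengths[1])
            * jc_prob(x == r, lengths[2])
            for x in STATES
        )
        total += count * math.log(site)
    return total


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def triple_distance(seq_a, seq_b, seq_ref, sweeps=20):
    """Estimate d(a, b) as t_a + t_b from the ML branch lengths of the three-leaf tree."""
    if not (len(seq_a) == len(seq_b) == len(seq_ref)):
        raise ValueError("sequences must be aligned to the same length")
    patterns = Counter(zip(seq_a, seq_b, seq_ref))  # site pattern -> count
    lengths = [0.1, 0.1, 0.1]  # arbitrary starting point
    for _ in range(sweeps):
        for i in range(3):
            def f(t, i=i):
                trial = list(lengths)
                trial[i] = t
                return log_likelihood(patterns, trial)
            # Optimize one branch at a time, holding the other two fixed.
            lengths[i] = golden_max(f, 1e-8, 5.0)
    return lengths[0] + lengths[1], lengths


if __name__ == "__main__":
    a = "ACGT" * 5
    b = "CA" + a[2:]                  # differs from a at 2 sites
    ref = a[:10] + "TGC" + a[13:]     # differs from a at 3 other sites
    distance, branch_lengths = triple_distance(a, b, ref)
    print(distance, branch_lengths)
```

Under JC69 each site likelihood is linear in exp(-4t/3) for any single branch, so the one-dimensional problems solved by `golden_max` are unimodal. A fixed number of sweeps is used for brevity; a convergence check on the log-likelihood would be a natural refinement.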