Desper Richard, Gascuel Olivier
Department of Biology, University College, London, United Kingdom.
Curr Protoc Bioinformatics. 2006 Oct;Chapter 6:Unit 6.3. doi: 10.1002/0471250953.bi0603s15.
Neighbor Joining (NJ), FastME, and other distance-based programs, including BIONJ, WEIGHBOR, and (to some extent) FITCH, are fast methods for building phylogenetic trees. Their speed makes them particularly effective for large-scale studies and for bootstrap analysis, both of which require runs on many data sets. Like maximum likelihood methods, distance methods rely on a model of sequence evolution, which is used to estimate the matrix of pairwise evolutionary distances. Computer simulations indicate that FastME has the best topological accuracy, followed by FITCH, WEIGHBOR, and BIONJ, with NJ the least accurate. Moreover, FastME is even faster than NJ on large data sets. The best distance methods perform comparably to parsimony in most cases, but are more accurate when the molecular clock is strongly violated or when long (e.g., outgroup) branches are present. This unit describes how to use distance-based methods, focusing on NJ (the most popular) and FastME (currently the most efficient). It also describes how to estimate evolutionary distances from DNA and protein sequences, how to perform bootstrap analysis, and how to use CLUSTAL to compute both a sequence alignment and a phylogenetic tree.
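As an illustration of the first step shared by all distance methods, estimating pairwise evolutionary distances under a model of sequence evolution, the following minimal Python sketch computes a distance matrix from aligned DNA sequences. It assumes the Jukes-Cantor (JC69) model, which is only one of many possible models and is not necessarily the one recommended in the unit. The function names (`p_distance`, `jc69_distance`, `distance_matrix`) and the toy sequences are hypothetical and are not taken from NJ, FastME, or any other program named above.

```python
import math
from itertools import combinations

VALID = set("ACGT")

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in VALID and y in VALID:
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared

def jc69_distance(a, b):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The correction is undefined at saturation (assumed handling).
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs):
    """Return (names, nested dict) of pairwise JC69 distances."""
    names = list(seqs)
    d = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d[n][m] = d[m][n] = jc69_distance(seqs[n], seqs[m])
    return names, d

# Toy alignment, invented for illustration only.
alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTACGTAT",
    "taxonC": "ACGAACGTTT",
}
names, dist = distance_matrix(alignment)
for n in names:
    print(n, " ".join(f"{dist[n][m]:.4f}" for m in names))
```

A matrix of this kind is the input that tree-building programs such as NJ or FastME take; in practice the distance estimation programs described in the unit would be used instead of hand-written code.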