Department of Statistics, University of Georgia, Athens, Georgia.
Institute of Bioinformatics, University of Georgia, Athens, Georgia.
Ann N Y Acad Sci. 2015 Dec;1360:36-53. doi: 10.1111/nyas.12747. Epub 2015 Apr 14.
The heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. Phylogenetic methods known as "species tree" methods have been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Here we review theory and empirical examples that help clarify conflicts between species tree and concatenation methods, and misconceptions in the literature about the performance of species tree methods. Considering concatenation as a special case of the multispecies coalescent model helps explain differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences and long-branch attraction. We show that approaches, such as binning, designed to augment the signal in species tree analyses can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods incorporating biological realism are a key to phylogenetic analysis of whole-genome data.
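As a rough illustration of how incomplete lineage sorting yields a diversity of gene trees from a single species tree, the sketch below computes gene tree probabilities for the simplest case of three taxa. It is not taken from the article. It assumes the standard three-taxon result under the multispecies coalescent: for a species tree ((A,B),C) whose internal branch has length t in coalescent units, the matching gene tree topology has probability 1 - (2/3)e^(-t) and each of the two discordant topologies has probability (1/3)e^(-t). The function name `three_taxon_gene_tree_probs`, the taxon labels, and the example branch lengths are hypothetical choices made for this example.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the internal branch length in coalescent units (an assumed input;
    shorter t corresponds to a more rapid radiation).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability that the A and B lineages fail to coalesce on the internal
    # branch is exp(-t); the three lineages then coalesce in random order,
    # so each topology receives one third of that probability mass.
    p_discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant,  # matches the species tree
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }


if __name__ == "__main__":
    # Illustrative branch lengths only, not values from any data set.
    for t in (0.1, 1.0, 5.0):
        probs = three_taxon_gene_tree_probs(t)
        print(t, {topology: round(p, 3) for topology, p in probs.items()})
```

With a short internal branch the three topologies approach equal frequency, which is the regime in which gene trees disagree most with the species tree. With a long internal branch nearly all gene trees match the species tree.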