Program in Organismic and Evolutionary Biology, University of Massachusetts, 611 North Pleasant Street, Amherst, MA 01003, USA.
Syst Biol. 2010 Oct;59(5):518-33. doi: 10.1093/sysbio/syq037. Epub 2010 Jul 23.
An accurate reconstruction of the eukaryotic tree of life is essential to identify the innovations underlying the diversity of microbial and macroscopic (e.g., plants and animals) eukaryotes. Previous work has divided eukaryotic diversity into a small number of high-level "supergroups," many of which receive strong support in phylogenomic analyses. However, the abundance of data in phylogenomic analyses can lead to highly supported but incorrect relationships due to systematic phylogenetic error. Furthermore, the paucity of major eukaryotic lineages (19 or fewer) included in these genomic studies may exaggerate systematic error and reduce power to evaluate hypotheses. Here, we use a taxon-rich strategy to assess eukaryotic relationships. We show that analyses emphasizing broad taxonomic sampling (up to 451 taxa representing 72 major lineages) combined with a moderate number of genes yield a well-resolved eukaryotic tree of life. The consistency across analyses with varying numbers of taxa (88-451) and levels of missing data (17-69%) supports the accuracy of the resulting topologies. The resulting stable topology emerges without the removal of rapidly evolving genes or taxa, a practice common to phylogenomic analyses. Several major groups are stable and strongly supported in these analyses (e.g., SAR, Rhizaria, Excavata), whereas the proposed supergroup "Chromalveolata" is rejected. Furthermore, extensive instability among photosynthetic lineages suggests the presence of systematic biases including endosymbiotic gene transfer from symbiont (nucleus or plastid) to host. Our analyses demonstrate that stable topologies of ancient evolutionary relationships can be achieved with broad taxonomic sampling and a moderate number of genes. 
Finally, taxon-rich analyses such as those presented here provide a method for testing the accuracy of relationships that receive high bootstrap support (BS) in phylogenomic analyses and enable placement of the multitude of lineages that lack genome-scale data.