Borowiec Marek L, Zhang Y Miles, Neves Karen, Ramalho Manuela O, Fisher Brian L, Lucky Andrea, Moreau Corrie S
Colorado State University, Department of Agricultural Biology, Fort Collins, CO 80523, USA.
University of Edinburgh, Institute of Ecology and Evolution, Edinburgh, EH9 3FL, UK.
Syst Biol. 2025 Jan 8. doi: 10.1093/sysbio/syaf001.
While some relationships in phylogenomic studies have remained stable since the Sanger sequencing era, many challenging nodes persist even with genome-scale data. Incongruence or lack of resolution in the phylogenomic era is frequently attributed to inadequate data modeling and analytical issues that lead to systematic biases. However, few studies investigate the potential for random error, establish expectations for the level of resolution achievable with a given empirical dataset, or integrate uncertainties across methods when faced with conflicting results. Ants are the most species-rich lineage of social insects and among the most ecologically important terrestrial animals. Consequently, ants have garnered significant research attention, including to their systematics. Despite this, there has been no comprehensive genus-level phylogeny of the ants inferred using genomic data that thoroughly evaluates both signal strength and incongruence. In this study, we provide insight into and quantify uncertainty across the ant tree of life by utilizing the most taxonomically comprehensive Ultraconserved Elements dataset of ants to date, including 277 (81%) of the recognized ant genera, drawn from all 16 extant subfamilies and representing over 98% of described species. We use simulations to establish expectations for resolution, identify branches with less-than-expected concordance, and dissect the effects of data and model selection on recalcitrant nodes. Simulations show that hundreds of loci are needed to resolve recalcitrant nodes on our genus-level ant phylogeny. This demonstrates the continued role of random error in phylogenomic studies. Our analyses provide a comprehensive picture of support and incongruence across the ant phylogeny, while offering a more nuanced depiction of uncertainty and significantly expanding generic sampling. We use a consensus approach to integrate uncertainty across different analyses and find that assumptions about root age exert substantial influence on divergence dating. Our results suggest that advancing the understanding of ant phylogeny will require not only more data but also more refined phylogenetic models. We also provide a workflow for identifying under-supported nodes in concatenation analyses, outline a pragmatic way to reconcile conflicting results in phylogenomics, and introduce a user-friendly locus selection tool for divergence dating.