Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
Department of Biology, Geology and Environmental Science, University of Tennessee at Chattanooga, Chattanooga, TN 37403, USA.
Syst Biol. 2019 Jul 1;68(4):573-593. doi: 10.1093/sysbio/syy085.
Resolving patterns of ancient and rapid diversifications is one of the most challenging tasks in evolutionary biology. These difficulties arise from confusing phylogenetic signals that are associated with the interplay of incomplete lineage sorting (ILS) and homoplasy. Phylogenomic analyses of hundreds, or even thousands, of loci offer the potential to resolve such contentious relationships. Yet, how much useful phylogenetic information these large data sets contain remains uncertain and often goes untested. Here, we assess the utility of different data filtering approaches to maximize phylogenetic information and minimize noise when reconstructing an ancient radiation of Neotropical electric knifefishes (Order Gymnotiformes) using ultraconserved elements. We found two contrasting hypotheses of gymnotiform evolutionary relationships depending on whether phylogenetic inferences were based on concatenation or coalescent methods. In the first case, all analyses inferred a previously and commonly proposed hypothesis, where the family Apteronotidae was found as the sister group to all other gymnotiform families. In contrast, coalescent-based analyses suggested a novel hypothesis where families producing pulse-type (viz., Gymnotidae, Hypopomidae, and Rhamphichthyidae) and wave-type electric signals (viz., Apteronotidae, Sternopygidae) were reciprocally monophyletic. Nodal support for this second hypothesis increased when analyzing loci with the highest phylogenetic information content and further increased when data were pruned using targeted filtering methods that maximized phylogenetic informativeness at the deepest nodes of the Gymnotiformes. Bayesian concordance analyses and topology tests of individual gene genealogies demonstrated that the difficulty of resolving this radiation was likely due to high gene-tree incongruences that resulted from ILS.
We show that data filtering reduces gene-tree heterogeneity and increases nodal support and consistency of species trees using coalescent methods; however, we failed to observe the same effect when using concatenation methods. Furthermore, the targeted filtering strategies applied here support the use of "gene data interrogation" rather than "gene genealogy interrogation" approaches in phylogenomic analyses, to extract phylogenetic signal from intractable portions of the Tree of Life.