Gene Tree Estimation Error with Ultraconserved Elements: An Empirical Study on Pseudapis Bees.

Affiliations

Department of Entomology, Cornell University, Comstock Hall, Ithaca, NY 14853, USA.

Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA.

Publication information

Syst Biol. 2021 Jun 16;70(4):803-821. doi: 10.1093/sysbio/syaa097.

DOI:10.1093/sysbio/syaa097

PMID:33367855

Abstract

Summarizing individual gene trees to species phylogenies using two-step coalescent methods is now a standard strategy in the field of phylogenomics. However, practical implementations of summary methods suffer from gene tree estimation error, which is caused by various biological and analytical factors. Greatly understudied is the choice of gene tree inference method and downstream effects on species tree estimation for empirical data sets. To better understand the impact of this method choice on gene and species tree accuracy, we compare gene trees estimated through four widely used programs under different model-selection criteria: PhyloBayes, MrBayes, IQ-Tree, and RAxML. We study their performance in the phylogenomic framework of >800 ultraconserved elements from the bee subfamily Nomiinae (Halictidae). Our taxon sampling focuses on the genus Pseudapis, a distinct lineage with diverse morphological features, but contentious morphology-based taxonomic classifications and no molecular phylogenetic guidance. We approximate topological accuracy of gene trees by assessing their ability to recover two uncontroversial, monophyletic groups, and compare branch lengths of individual trees using the stemminess metric (the relative length of internal branches). We further examine different strategies of removing uninformative loci and the collapsing of weakly supported nodes into polytomies. We then summarize gene trees with ASTRAL and compare resulting species phylogenies, including comparisons to concatenation-based estimates. Gene trees obtained with the reversible jump model search in MrBayes were most concordant on average and all Bayesian methods yielded gene trees with better stemminess values. The only gene tree estimation approach whose ASTRAL summary trees consistently produced the most likely correct topology, however, was IQ-Tree with automated model designation (ModelFinder program). We discuss these findings and provide practical advice on gene tree estimation for summary methods. Lastly, we establish the first phylogeny-informed classification for Pseudapis s. l. and map the distribution of distinct morphological features of the group. [ASTRAL; Bees; concordance; gene tree estimation error; IQ-Tree; MrBayes; Nomiinae; PhyloBayes; RAxML; phylogenomics; stemminess].
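The abstract describes stemminess only as "the relative length of internal branches" and does not give the exact formula used in the study. As a rough illustration, the sketch below assumes one common reading: the share of total tree length that lies on internal (non-terminal) branches. The `Node` class, the `stemminess` function, and the four-taxon example tree are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A tree node; `length` is the length of the branch leading to it."""
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def stemminess(root: Node) -> float:
    """Fraction of total tree length found on internal branches.

    Assumption: stemminess = (sum of internal branch lengths) /
    (sum of all branch lengths). The paper's exact definition may differ.
    """
    internal = 0.0
    total = 0.0
    stack = list(root.children)  # the root itself has no parent branch
    while stack:
        node = stack.pop()
        total += node.length
        if node.children:  # a branch leading to a non-leaf node is internal
            internal += node.length
        stack.extend(node.children)
    if total == 0.0:
        raise ValueError("tree has zero total branch length")
    return internal / total


# Hypothetical four-taxon tree: ((A:1,B:1):2,(C:1,D:1):2);
tree = Node(children=[
    Node(2.0, [Node(1.0), Node(1.0)]),
    Node(2.0, [Node(1.0), Node(1.0)]),
])
value = stemminess(tree)  # internal length 4 out of total length 8
```

Under this reading, a higher value means more of the tree's length sits on internal branches relative to terminal ones. In practice the gene trees would be read from Newick files with a phylogenetics library rather than built by hand.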

Similar articles

Collapsing dubiously resolved gene-tree branches in phylogenomic coalescent analyses.

Mol Phylogenet Evol. 2021 May;158:107092. doi: 10.1016/j.ympev.2021.107092. Epub 2021 Feb 2.

The gene tree delusion.

Mol Phylogenet Evol. 2016 Jan;94(Pt A):1-33. doi: 10.1016/j.ympev.2015.07.018. Epub 2015 Jul 31.

To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods.

Syst Biol. 2018 Mar 1;67(2):285-303. doi: 10.1093/sysbio/syx077.

Theoretical and Practical Considerations when using Retroelement Insertions to Estimate Species Trees in the Anomaly Zone.

Syst Biol. 2022 Apr 19;71(3):721-740. doi: 10.1093/sysbio/syab086.

Comparing species tree estimation with large anchored phylogenomic and small Sanger-sequenced molecular datasets: an empirical study on Malagasy pseudoxyrhophiine snakes.

BMC Evol Biol. 2015 Oct 12;15:221. doi: 10.1186/s12862-015-0503-1.

Divergence and support among slightly suboptimal likelihood gene trees.

Cladistics. 2020 Jun;36(3):322-340. doi: 10.1111/cla.12404. Epub 2019 Nov 13.

Larger, unfiltered datasets are more effective at resolving phylogenetic conflict: Introns, exons, and UCEs resolve ambiguities in Golden-backed frogs (Anura: Ranidae; genus Hylarana).

Mol Phylogenet Evol. 2020 Oct;151:106899. doi: 10.1016/j.ympev.2020.106899. Epub 2020 Jun 24.

Accounting for Uncertainty in Gene Tree Estimation: Summary-Coalescent Species Tree Inference in a Challenging Radiation of Australian Lizards.

Syst Biol. 2017 May 1;66(3):352-366. doi: 10.1093/sysbio/syw089.

The impact of GC bias on phylogenetic accuracy using targeted enrichment phylogenomic data.

Mol Phylogenet Evol. 2017 Jun;111:149-157. doi: 10.1016/j.ympev.2017.03.022. Epub 2017 Apr 5.

Cited by

Concatenation fails to describe the anomalous radiation of giant cockroaches (Blattodea: Blaberidae) despite moderate to low discordance.

BMC Ecol Evol. 2025 Jul 21;25(1):72. doi: 10.1186/s12862-025-02409-4.

Extensive genome-wide phylogenetic discordance is due to incomplete lineage sorting in the rapidly radiated East Asian genus Nekemias (Vitaceae).

Ann Bot. 2025 May 9;135(5):925-934. doi: 10.1093/aob/mcae224.

Neglected no longer: Phylogenomic resolution of higher-level relationships in Solifugae.

iScience. 2023 Aug 19;26(9):107684. doi: 10.1016/j.isci.2023.107684. eCollection 2023 Sep 15.

Filtration of Gene Trees From 9,000 Exons, Introns, and UCEs Disentangles Conflicting Phylogenomic Relationships in Tree Frogs (Hylidae).

Genome Biol Evol. 2023 May 5;15(5). doi: 10.1093/gbe/evad070.

Weighting by Gene Tree Uncertainty Improves Accuracy of Quartet-based Species Trees.

Mol Biol Evol. 2022 Dec 5;39(12). doi: 10.1093/molbev/msac215.

QuCo: quartet-based co-estimation of species trees and gene trees.

Bioinformatics. 2022 Jun 24;38(Suppl 1):i413-i421. doi: 10.1093/bioinformatics/btac265.
