A simulation study comparing supertree and combined analysis methods using SMIDGen.

Authors

M. Shel Swenson, François Barbançon, Tandy Warnow, C. Randal Linder

Affiliation

Department of Computer Sciences, The University of Texas at Austin, Austin TX, USA.

Publication

Algorithms Mol Biol. 2010 Jan 4;5:8. doi: 10.1186/1748-7188-5-8.

DOI:10.1186/1748-7188-5-8

PMID:20047664

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2837663/

Abstract

BACKGROUND

Supertree methods comprise one approach to reconstructing large molecular phylogenies given multi-marker datasets: trees are estimated on each marker and then combined into a tree (the "supertree") on the entire set of taxa. Supertrees can be constructed using various algorithmic techniques, with the most common being matrix representation with parsimony (MRP). When the data allow, the competing approach is a combined analysis (also known as a "supermatrix" or "total evidence" approach) whereby the different sequence data matrices for each of the different subsets of taxa are concatenated into a single supermatrix, and a tree is estimated on that supermatrix.
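The two kinds of input matrix described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `concatenate_supermatrix` and `mrp_matrix` and the input layouts are hypothetical, and the MRP encoding shown is the standard Baum-Ragan coding (1 for taxa inside a clade, 0 for taxa in the source tree but outside the clade, ? for taxa absent from the source tree), which the abstract does not confirm as the exact variant used. The sketch only builds the matrices; the tree search on either matrix would be handed to separate parsimony or likelihood software.

```python
from typing import Dict, List, Set, Tuple


def concatenate_supermatrix(markers: List[Dict[str, str]],
                            missing: str = "?") -> Dict[str, str]:
    """Concatenate per-marker alignments into one supermatrix.

    Each marker maps taxon name -> aligned sequence. A taxon that is
    absent from a marker is padded with the missing-data symbol.
    """
    taxa = sorted({taxon for marker in markers for taxon in marker})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for marker in markers:
        lengths = {len(seq) for seq in marker.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a marker must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(marker.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


def mrp_matrix(source_trees: List[Tuple[Set[str], List[Set[str]]]]) -> Dict[str, str]:
    """Encode source trees as an MRP matrix (standard Baum-Ragan coding).

    Each source tree is given as (leaf set, list of clades), where a clade
    is the set of leaves below one internal edge. Every clade becomes one
    binary character (one column) in the matrix.
    """
    all_taxa = sorted(set().union(*(leaves for leaves, _ in source_trees)))
    rows: Dict[str, List[str]] = {taxon: [] for taxon in all_taxa}
    for leaves, clades in source_trees:
        for clade in clades:
            for taxon in all_taxa:
                if taxon not in leaves:
                    rows[taxon].append("?")   # taxon not sampled in this tree
                elif taxon in clade:
                    rows[taxon].append("1")   # taxon inside the clade
                else:
                    rows[taxon].append("0")   # taxon in the tree, outside the clade
    return {taxon: "".join(cols) for taxon, cols in rows.items()}
```

With two toy markers, `[{"A": "ACGT", "B": "ACGA", "C": "ACTA"}, {"B": "GG", "C": "GA", "D": "TA"}]`, the supermatrix has one row per taxon across both markers, with A and D padded by `?` where they were not sequenced. With two toy source trees on {A, B, C} and {B, C, D}, the MRP matrix has one column per clade and `?` entries for the taxon each tree lacks.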

RESULTS

In this paper, we describe an extensive simulation study we performed comparing two supertree methods, MRP and weighted MRP, to combined analysis methods on large model trees. A key contribution of this study is our novel simulation methodology (Super-Method Input Data Generator, or SMIDGen) that better reflects biological processes and the practices of systematists than earlier simulations. We show that combined analysis based upon maximum likelihood outperforms MRP and weighted MRP, giving especially big improvements when the largest subtree does not contain most of the taxa.

CONCLUSIONS

This study demonstrates that MRP and weighted MRP produce distinctly less accurate trees than combined analyses for a given base method (maximum parsimony or maximum likelihood). Since there are situations in which combined analyses are not feasible, there is a clear need for better supertree methods. The source tree and combined datasets used in this study can be used to test other supertree and combined analysis methods.
