
Comparative performance of supertree algorithms in large data sets using the soapberry family (Sapindaceae) as a case study.

Affiliation

Real Jardin Botanico, Department of Biodiversity and Conservation, CSIC, Plaza de Murillo 2, 28014 Madrid, Spain.

Publication information

Syst Biol. 2011 Jan;60(1):32-44. doi: 10.1093/sysbio/syq057. Epub 2010 Nov 10.

DOI: 10.1093/sysbio/syq057
PMID: 21068445

Abstract

For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method that yet requires substantial computing power.
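
To make the matrix representation step concrete, here is a minimal sketch of the standard Baum-Ragan coding that MRP analyses commonly start from. It is an illustration only, not the authors' pipeline: the nested-tuple tree representation and the function names `leaves`, `clades`, and `mrp_matrix` are assumptions introduced here, and the Purvis coding scheme mentioned in the abstract is not covered. Each internal clade of each input tree becomes one binary character: taxa inside the clade score 1, other taxa in that tree score 0, and taxa absent from that tree score `?`.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree (hypothetical format)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Yield the leaf set of every internal node below the root."""
    if isinstance(tree, str):
        return
    for child in tree:
        if not isinstance(child, str):
            yield frozenset(leaves(child))
            yield from clades(child)


def mrp_matrix(trees):
    """Build a Baum-Ragan MRP matrix: one character string per taxon."""
    all_taxa = sorted(set().union(*(leaves(t) for t in trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon missing from this input tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon belongs to the clade
                else:
                    rows[taxon].append("0")  # taxon in the tree, outside the clade
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


# Two small input trees with partially overlapping taxon sets.
input_trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
for taxon, characters in mrp_matrix(input_trees).items():
    print(taxon, characters)
```

The resulting matrix would then be handed to a parsimony program for a heuristic search; that search step is outside the scope of this sketch.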

Similar articles

1
Comparative performance of supertree algorithms in large data sets using the soapberry family (Sapindaceae) as a case study.
Syst Biol. 2011 Jan;60(1):32-44. doi: 10.1093/sysbio/syq057. Epub 2010 Nov 10.
2
Performance of flip supertree construction with a heuristic algorithm.
Syst Biol. 2004 Apr;53(2):299-308. doi: 10.1080/10635150490423719.
3
The shape of supertrees to come: tree shape related properties of fourteen supertree methods.
Syst Biol. 2005 Jun;54(3):419-31. doi: 10.1080/10635150590949832.
4
Complete generic-level phylogenetic analyses of palms (Arecaceae) with comparisons of supertree and supermatrix approaches.
Syst Biol. 2009 Apr;58(2):240-56. doi: 10.1093/sysbio/syp021. Epub 2009 May 30.
5
Novel versus unsupported clades: assessing the qualitative support for clades in MRP supertrees.
Syst Biol. 2003 Dec;52(6):839-48.
6
Assessment of the accuracy of matrix representation with parsimony analysis supertree construction.
Syst Biol. 2001 Aug;50(4):565-79.
7
Imputing supertrees and supernetworks from quartets.
Syst Biol. 2007 Feb;56(1):57-67. doi: 10.1080/10635150601167013.
8
SuperFine: fast and accurate supertree estimation.
Syst Biol. 2012 Mar;61(2):214-27. doi: 10.1093/sysbio/syr092. Epub 2011 Sep 20.
9
SDM: a fast distance-based approach for (super) tree building in phylogenomics.
Syst Biol. 2006 Oct;55(5):740-55. doi: 10.1080/10635150600969872.
10
Building supertrees: an empirical assessment using the grass family (Poaceae).
Syst Biol. 2002 Feb;51(1):136-50. doi: 10.1080/106351502753475916.

Cited by

1
An updated infra-familial classification of Sapindaceae based on targeted enrichment data.
Am J Bot. 2021 Jul;108(7):1234-1251. doi: 10.1002/ajb2.1693. Epub 2021 Jul 5.
2
Colony size predicts division of labour in attine ants.
Proc Biol Sci. 2014 Oct 22;281(1793). doi: 10.1098/rspb.2014.1411.
3
Reconciliation of gene and species trees.
Biomed Res Int. 2014;2014:642089. doi: 10.1155/2014/642089. Epub 2014 Mar 27.
4
Building megaphylogenies for macroecology: taking up the challenge.
Ecography. 2013 Jan 1;36(1):13-26. doi: 10.1111/j.1600-0587.2012.07773.x.
5
An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines.
BMC Evol Biol. 2014 Mar 29;14:67. doi: 10.1186/1471-2148-14-67.
6
Ancient origins of vertebrate-specific innate antiviral immunity.
Mol Biol Evol. 2014 Jan;31(1):140-53. doi: 10.1093/molbev/mst184. Epub 2013 Oct 8.
7
The abrupt climate change at the Eocene-Oligocene boundary and the emergence of South-East Asia triggered the spread of sapindaceous lineages.
Ann Bot. 2013 Jul;112(1):151-60. doi: 10.1093/aob/mct106. Epub 2013 May 30.
8
Comparative transcriptomics of early dipteran development.
BMC Genomics. 2013 Feb 24;14:123. doi: 10.1186/1471-2164-14-123.
9
Computing diversity from dated phylogenies and taxonomic hierarchies: does it make a difference to the conclusions?
Oecologia. 2012 Oct;170(2):501-6. doi: 10.1007/s00442-012-2318-8. Epub 2012 Apr 17.
10
Polynomial supertree methods revisited.
Adv Bioinformatics. 2011;2011:524182. doi: 10.1155/2011/524182. Epub 2011 Dec 21.