A supertree pipeline for summarizing phylogenetic and taxonomic information for millions of species.

Authors

Redelings Benjamin D, Holder Mark T

Affiliations

Department of Biology, Duke University, Durham, NC, United States; Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, United States.

Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, United States; Biodiversity Institute, University of Kansas, Lawrence, KS, United States; Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.

Publication information

PeerJ. 2017 Mar 1;5:e3058. doi: 10.7717/peerj.3058. eCollection 2017.

DOI:10.7717/peerj.3058

PMID:28265520

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5335690/

Abstract

We present a new supertree method that enables rapid estimation of a summary tree on the scale of millions of leaves. This supertree method summarizes a collection of input phylogenies and an input taxonomy. We introduce formal goals and criteria for such a supertree to satisfy in order to transparently and justifiably represent the input trees. In addition to producing a supertree, our method computes annotations that describe which groupings in the input trees support and conflict with each group in the supertree. We compare our supertree construction method to a previously published supertree construction method by assessing their performance on the input trees used to construct the Open Tree of Life version 4, and find that our method increases the number of displayed input splits from 35,518 to 39,639 and decreases the number of conflicting input splits from 2,760 to 1,357. The new supertree method also improves on the previous supertree construction method in that it produces no unsupported branches and avoids unnecessary polytomies. This pipeline is currently used by the Open Tree of Life project to produce all versions of the project's "synthetic tree" starting at version 5. This software pipeline is called "". It relies heavily on "", a set of C++ tools that perform most of the steps of the pipeline. All of the components are free software and are available on GitHub.
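To make the terms "displayed" and "conflicting" concrete, the following is a minimal sketch of how one grouping from an input tree can be compared against a supertree. It is not the pipeline's code: the nested-tuple tree encoding and the function names (`leaves`, `clusters`, `classify`) are assumptions made for illustration, and it uses the common textbook definitions for rooted trees, which may differ in detail from the formal criteria in the paper. A supertree group is first restricted to the taxa of the input tree; an input grouping is then treated as displayed if some restricted group equals it, and as conflicting if some restricted group overlaps it without either containing the other.

```python
# Trees are nested tuples of leaf names, e.g. (("A", "B"), "C").
# This encoding and all function names are hypothetical, for illustration only.

def leaves(tree):
    """Return the set of leaf labels in a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clusters(tree):
    """Return the leaf set below every internal node except the root."""
    found = set()

    def visit(node, is_root):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child, False) for child in node))
        if not is_root:
            found.add(below)
        return below

    visit(tree, True)
    return found

def classify(input_tree, supertree):
    """Label each input grouping as displayed, conflicting, or unresolved."""
    taxa = leaves(input_tree)
    # Restrict every supertree group to the taxa present in the input tree.
    restricted = {group & taxa for group in clusters(supertree)}
    result = {}
    for c in clusters(input_tree):
        if c in restricted:
            result[c] = "displayed"
        elif any(c & g and c - g and g - c for g in restricted):
            # Proper overlap: the two groups cannot both appear in one rooted tree.
            result[c] = "conflicting"
        else:
            # Neither displayed nor contradicted, e.g. inside a polytomy.
            result[c] = "unresolved"
    return result

if __name__ == "__main__":
    input_tree = ((("A", "B"), "C"), "D")
    supertree = (("A", "B"), ("C", "D"), "E")
    for group, status in classify(input_tree, supertree).items():
        print(sorted(group), status)
```

Counting the "displayed" and "conflicting" labels over all input trees gives totals of the kind the abstract reports, under the assumptions stated above.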

Similar articles

Complete generic-level phylogenetic analyses of palms (Arecaceae) with comparisons of supertree and supermatrix approaches.

Syst Biol. 2009 Apr;58(2):240-56. doi: 10.1093/sysbio/syp021. Epub 2009 May 30.

PhySIC: a veto supertree method with desirable properties.

Syst Biol. 2007 Oct;56(5):798-817. doi: 10.1080/10635150701639754.

Triplet supertree heuristics for the tree of life.

BMC Bioinformatics. 2009 Jan 30;10 Suppl 1(Suppl 1):S8. doi: 10.1186/1471-2105-10-S1-S8.

The Performance of Two Supertree Schemes Compared Using Synthetic and Real Data Quartet Input.

J Mol Evol. 2018 Feb;86(2):150-165. doi: 10.1007/s00239-018-9833-0. Epub 2018 Feb 19.

The Supertree Toolkit 2: a new and improved software package with a Graphical User Interface for supertree construction.

Biodivers Data J. 2014 Mar 26(2):e1053. doi: 10.3897/BDJ.2.e1053. eCollection 2014.

Gene Tree Construction and Correction Using SuperTree and Reconciliation.

IEEE/ACM Trans Comput Biol Bioinform. 2018 Sep-Oct;15(5):1560-1570. doi: 10.1109/TCBB.2017.2720581. Epub 2017 Jun 27.

Bad Clade Deletion Supertrees: A Fast and Accurate Supertree Algorithm.

Mol Biol Evol. 2017 Sep 1;34(9):2408-2421. doi: 10.1093/molbev/msx191.

Speeding up iterative applications of the BUILD supertree algorithm.

PeerJ. 2024 Jan 2;12:e16624. doi: 10.7717/peerj.16624. eCollection 2024.

Performance of flip supertree construction with a heuristic algorithm.

Syst Biol. 2004 Apr;53(2):299-308. doi: 10.1080/10635150490423719.

Cited by

Plants with higher dispersal capabilities follow 'abundant-centre' distributions but such patterns remain rare in animals.

Nat Commun. 2025 Sep 2;16(1):8205. doi: 10.1038/s41467-025-63566-0.

Synthesizing decades of research into one tree for birds.

Proc Natl Acad Sci U S A. 2025 Jun 3;122(22):e2507805122. doi: 10.1073/pnas.2507805122. Epub 2025 May 27.

A complete and dynamic tree of birds.

Proc Natl Acad Sci U S A. 2025 May 6;122(18):e2409658122. doi: 10.1073/pnas.2409658122. Epub 2025 Apr 29.

Is variation in female aggressiveness across species associated with reproductive potential?

Proc Biol Sci. 2025 Apr;292(2044):20242301. doi: 10.1098/rspb.2024.2301. Epub 2025 Apr 9.

Pleiotropy increases with gene age in six model multicellular eukaryotes.

bioRxiv. 2024 Nov 21:2024.11.19.624372. doi: 10.1101/2024.11.19.624372.

PhyloNext: a pipeline for phylogenetic diversity analysis of GBIF-mediated data.

BMC Ecol Evol. 2024 Jun 11;24(1):76. doi: 10.1186/s12862-024-02256-9.

Speeding up iterative applications of the BUILD supertree algorithm.

PeerJ. 2024 Jan 2;12:e16624. doi: 10.7717/peerj.16624. eCollection 2024.

Genomic Assessment of the Contribution of the Endosymbiont of to Gall Induction.

Int J Mol Sci. 2023 Jun 1;24(11):9613. doi: 10.3390/ijms24119613.

The electronic tree of life (eToL): a net of long probes to characterize the microbiome from RNA-seq data.

BMC Microbiol. 2022 Dec 22;22(1):317. doi: 10.1186/s12866-022-02671-2.

treedata.table: a wrapper for data.table that enables fast manipulation of large phylogenetic trees matched to data.

PeerJ. 2021 Nov 26;9:e12450. doi: 10.7717/peerj.12450. eCollection 2021.

References

Synthesis of phylogeny and taxonomy into a comprehensive tree of life.

Proc Natl Acad Sci U S A. 2015 Oct 13;112(41):12764-9. doi: 10.1073/pnas.1423041112. Epub 2015 Sep 18.

Phylesystem: a git-based data store for community-curated phylogenetic estimates.

Bioinformatics. 2015 Sep 1;31(17):2794-800. doi: 10.1093/bioinformatics/btv276. Epub 2015 May 4.

Reweaving the tapestry: a supertree of birds.

PLoS Curr. 2014 Jun 9;6:ecurrents.tol.c1af68dda7c999ed9f1e4b2d2df7a08e. doi: 10.1371/currents.tol.c1af68dda7c999ed9f1e4b2d2df7a08e.

The delayed rise of present-day mammals.

Nature. 2007 Mar 29;446(7135):507-12. doi: 10.1038/nature05634.
