Using Constrained-INC for Large-Scale Gene Tree and Species Tree Estimation.

Authors

Le Thien, Sy Aaron, Molloy Erin K, Zhang Qiuyi, Rao Satish, Warnow Tandy

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):2-15. doi: 10.1109/TCBB.2020.2990867. Epub 2021 Feb 3.

DOI:10.1109/TCBB.2020.2990867
PMID:32750844
Abstract

Incremental tree building (INC) is a new phylogeny estimation method that has been proven to be absolute fast converging under standard sequence evolution models. A variant of INC, called Constrained-INC, is designed for use in divide-and-conquer pipelines for phylogeny estimation where a set of species is divided into disjoint subsets, trees are computed on the subsets using a selected base method, and then the subset trees are combined together. We evaluate the accuracy of INC and Constrained-INC for gene tree and species tree estimation on simulated datasets, and compare them to similar pipelines using NJMerge (another method that merges disjoint trees). For gene tree estimation, we find that INC has very poor accuracy in comparison to standard methods, and even Constrained-INC (using maximum likelihood methods to compute constraint trees) does not match the accuracy of the better maximum likelihood methods. Results for species trees are somewhat different, with Constrained-INC coming close to the accuracy of the best species tree estimation methods, while being much faster; furthermore, using Constrained-INC allows species tree estimation methods to scale to large datasets within limited computational resources. Overall, this study exposes the benefits and limitations of divide-and-conquer strategies for large-scale phylogenetic tree estimation.
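
The divide-and-conquer setting described in the abstract can be illustrated with a small sketch. It is not the Constrained-INC algorithm itself; it only shows one property such pipelines are commonly expected to satisfy, under the assumption that a merged tree should induce each subset (constraint) tree when restricted to that subset's leaves. The nested-tuple tree encoding and the helper names `leaf_set`, `restrict`, `splits`, and `obeys_constraints` are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: trees are nested tuples of leaf-name strings.
# All function names here are hypothetical, not taken from any published tool.

def leaf_set(tree):
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def restrict(tree, keep):
    """Induced subtree on the leaves in `keep`, suppressing unary nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)

def splits(tree):
    """Non-trivial bipartitions of the leaf set, treating the tree as unrooted.

    Each split is stored as the side that does not contain a fixed
    reference leaf, so equal splits always compare equal.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        below = leaf_set(node)
        side = all_leaves - below if ref in below else below
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return found

def obeys_constraints(merged, constraint_trees):
    """True if `merged`, restricted to each constraint's leaves, matches it."""
    return all(
        splits(restrict(merged, leaf_set(c))) == splits(c)
        for c in constraint_trees
    )

# Two disjoint subsets of eight species, each with its own subset tree.
subset_tree_1 = (("A", "B"), ("C", "D"))
subset_tree_2 = (("E", "F"), ("G", "H"))
merged_tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))

print(obeys_constraints(merged_tree, [subset_tree_1, subset_tree_2]))
print(obeys_constraints(merged_tree, [(("A", "C"), ("B", "D")), subset_tree_2]))
```

The first call checks a merged tree against the two subset trees it was assembled from; the second swaps in a conflicting tree on the first subset, so the check fails. Comparing split sets rather than tuple shapes keeps the test independent of rooting and child order.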


Similar articles

1. Using Constrained-INC for Large-Scale Gene Tree and Species Tree Estimation.
   IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):2-15. doi: 10.1109/TCBB.2020.2990867. Epub 2021 Feb 3.
2. Statistically consistent divide-and-conquer pipelines for phylogeny estimation using NJMerge.
   Algorithms Mol Biol. 2019 Jul 19;14:14. doi: 10.1186/s13015-019-0151-x. eCollection 2019.
3. Constrained incremental tree building: new absolute fast converging phylogeny estimation methods with improved scalability and accuracy.
   Algorithms Mol Biol. 2019 Feb 6;14:2. doi: 10.1186/s13015-019-0136-9. eCollection 2019.
4. Unblended disjoint tree merging using GTM improves species tree estimation.
   BMC Genomics. 2020 Apr 16;21(Suppl 2):235. doi: 10.1186/s12864-020-6605-1.
5. SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.
   Syst Biol. 2012 Jan;61(1):90-106. doi: 10.1093/sysbio/syr095. Epub 2011 Dec 1.
6. On the quality of tree-based protein classification.
   Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
7. DACTAL: divide-and-conquer trees (almost) without alignments.
   Bioinformatics. 2012 Jun 15;28(12):i274-82. doi: 10.1093/bioinformatics/bts218.
8. RAxML and FastTree: comparing two methods for large-scale maximum likelihood phylogeny estimation.
   PLoS One. 2011;6(11):e27731. doi: 10.1371/journal.pone.0027731. Epub 2011 Nov 21.
9. TreeMerge: a new method for improving the scalability of species tree estimation methods.
   Bioinformatics. 2019 Jul 15;35(14):i417-i426. doi: 10.1093/bioinformatics/btz344.
10. Maximum likelihood estimates of species trees: how accuracy of phylogenetic inference depends upon the divergence history and sampling design.
    Syst Biol. 2009 Oct;58(5):501-8. doi: 10.1093/sysbio/syp045. Epub 2009 Aug 20.

Cited by

1. Quartet Fiduccia-Mattheyses revisited for larger phylogenetic studies.
   Bioinformatics. 2023 Jun 1;39(6). doi: 10.1093/bioinformatics/btad332.
2. Recent progress on methods for estimating and updating large phylogenies.
   Philos Trans R Soc Lond B Biol Sci. 2022 Oct 10;377(1861):20210244. doi: 10.1098/rstb.2021.0244. Epub 2022 Aug 22.