Efficient genome-scale phylogenetic analysis under the duplication-loss and deep coalescence cost models.

Affiliation

School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel.

Publication information

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S42. doi: 10.1186/1471-2105-11-S1-S42.

DOI:10.1186/1471-2105-11-S1-S42
PMID:20122216
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3009515/
Abstract

BACKGROUND

Genomic data provide a wealth of new information for phylogenetic analysis. Yet making use of these data requires phylogenetic methods that can efficiently analyze extremely large data sets and account for processes of gene evolution, such as gene duplication and loss, incomplete lineage sorting (deep coalescence), or horizontal gene transfer, that cause incongruence among gene trees. One such approach is gene tree parsimony, which, given a set of gene trees, seeks a species tree that requires the smallest number of evolutionary events to explain the incongruence of the gene trees. However, the only existing algorithms for gene tree parsimony under the duplication-loss or deep coalescence reconciliation cost are prohibitively slow for large data sets.
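
To make the gene tree parsimony objective concrete, the following is a minimal sketch of the duplication part of the reconciliation cost. It assumes rooted binary trees encoded as nested tuples whose leaves are species names, and it uses the standard LCA mapping: a gene-tree node counts as a duplication when it maps to the same species-tree node as one of its children. The encoding and all function names are illustrative assumptions, not taken from the paper, and losses and deep coalescence are not counted here.

```python
from typing import Union

# Assumed encoding: a leaf is a species name, an internal node is a (left, right) pair.
Tree = Union[str, tuple]


def leaf_set(tree: Tree) -> frozenset:
    """Return the set of species names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaf_set(tree[0]) | leaf_set(tree[1])


def lca_cluster(species_tree: Tree, target: frozenset) -> frozenset:
    """Return the leaf set of the lowest species-tree node containing all of target.

    Assumes every species in target occurs in species_tree.
    """
    node = species_tree
    while not isinstance(node, str):
        left, right = node
        if target <= leaf_set(left):
            node = left
        elif target <= leaf_set(right):
            node = right
        else:
            break  # target spans both children, so this node is the LCA
    return leaf_set(node)


def duplication_cost(gene_tree: Tree, species_tree: Tree) -> int:
    """Count gene-tree nodes that map to the same species node as a child."""

    def walk(g: Tree):
        # Returns (species cluster this node maps to, duplications in its subtree).
        if isinstance(g, str):
            return frozenset([g]), 0
        (map_l, dup_l), (map_r, dup_r) = walk(g[0]), walk(g[1])
        mapped = lca_cluster(species_tree, map_l | map_r)
        is_dup = 1 if mapped in (map_l, map_r) else 0
        return mapped, dup_l + dup_r + is_dup

    return walk(gene_tree)[1]


def total_duplications(gene_trees, species_tree: Tree) -> int:
    """Gene tree parsimony score of a candidate species tree (duplications only)."""
    return sum(duplication_cost(g, species_tree) for g in gene_trees)


# Toy data (hypothetical): one congruent gene tree, one incongruent, one with two copies of A.
species = (("A", "B"), "C")
genes = [(("A", "B"), "C"), (("A", "C"), "B"), (("A", "A"), "B")]
score = total_duplications(genes, species)
```

Gene tree parsimony then asks for the species tree that minimizes this total over all candidate topologies, which is why the cost of evaluating each candidate matters so much at genome scale.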

RESULTS

We describe novel algorithms for local search heuristics based on subtree pruning and regrafting (SPR) and tree bisection and reconnection (TBR) under the duplication-loss cost, and we show how they can be adapted for the deep coalescence cost. These algorithms improve upon the best existing algorithms for these problems by a factor of n, where n is the number of species in the collection of gene trees. We implemented our new SPR-based local search algorithm for the duplication-loss cost and demonstrate the tremendous improvement in runtime and scalability it provides compared to existing implementations. We also evaluate the performance of our algorithm on three large-scale genomic data sets.
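
For orientation, here is a minimal sketch of the naive SPR local search that such algorithms aim to accelerate. It is not the authors' method: it rescores every SPR neighbor from scratch, which is exactly the redundant work a faster algorithm avoids. It assumes rooted binary trees as nested tuples with unique species names in the species tree, counts duplications only, and uses hypothetical function names throughout.

```python
def subtrees(tree):
    """Yield every node of a nested-tuple tree, root first."""
    yield tree
    if not isinstance(tree, str):
        yield from subtrees(tree[0])
        yield from subtrees(tree[1])


def leaf_set(tree):
    if isinstance(tree, str):
        return frozenset([tree])
    return leaf_set(tree[0]) | leaf_set(tree[1])


def dup_cost(gene, species):
    """Duplications implied by the LCA mapping (simple, unoptimized version)."""
    clusters = [leaf_set(s) for s in subtrees(species)]

    def lca(target):
        # Species clusters are nested, so the smallest one containing target is the LCA.
        return min((c for c in clusters if target <= c), key=len)

    def walk(g):
        if isinstance(g, str):
            return lca(frozenset([g])), 0
        (ml, dl), (mr, dr) = walk(g[0]), walk(g[1])
        m = lca(ml | mr)
        return m, dl + dr + (1 if m in (ml, mr) else 0)

    return walk(gene)[1]


def prune(tree, sub):
    """Remove subtree sub and suppress its parent node."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if left == sub:
        return right
    if right == sub:
        return left
    return (prune(left, sub), prune(right, sub))


def regraft(tree, sub, target):
    """Attach sub on the edge above node target."""
    if tree == target:
        return (tree, sub)
    if isinstance(tree, str):
        return tree
    return (regraft(tree[0], sub, target), regraft(tree[1], sub, target))


def spr_neighbors(tree):
    """Yield all trees one SPR move away (may include the original topology)."""
    for sub in subtrees(tree):
        if sub == tree:
            continue  # the root cannot be pruned
        rest = prune(tree, sub)
        for target in subtrees(rest):
            yield regraft(rest, sub, target)


def local_search(start, gene_trees):
    """First-improvement hill climbing; rescoring each neighbor is the naive part."""
    def score(s):
        return sum(dup_cost(g, s) for g in gene_trees)

    best, best_cost = start, score(start)
    improved = True
    while improved:
        improved = False
        for cand in spr_neighbors(best):
            cost = score(cand)
            if cost < best_cost:
                best, best_cost, improved = cand, cost, True
                break  # restart the scan from the improved tree
    return best, best_cost


# Toy data (hypothetical): two gene trees agree, one conflicts.
t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
gene_trees = [t1, t1, t2]
start_cost = sum(dup_cost(g, t2) for g in gene_trees)
final_tree, final_cost = local_search(t2, gene_trees)
```

Because the search accepts only strict improvements, it can stall on plateaus where every neighbor has the same cost. Practical heuristics typically add restarts or other tie-handling, which this sketch omits.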

CONCLUSION

Our new algorithms enable, for the first time, gene tree parsimony analyses of thousands of genes from hundreds of taxa using the duplication-loss and deep coalescence reconciliation costs. Thus, this work expands both the size of data sets and the range of evolutionary models that can be incorporated into genome-scale phylogenetic analyses.
