Consequences of Common Topological Rearrangements for Partition Trees in Phylogenomic Inference.

Authors

Chernomor Olga, Minh Bui Quang, von Haeseler Arndt

Affiliations

1 Max F. Perutz Laboratories, Center for Integrative Bioinformatics Vienna, University of Vienna, Vienna, Austria.

2 Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria.

Publication information

J Comput Biol. 2015 Dec;22(12):1129-42. doi: 10.1089/cmb.2015.0146. Epub 2015 Oct 8.

DOI: 10.1089/cmb.2015.0146

PMID: 26448206

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4663649/

Abstract

In phylogenomic analysis the collection of trees with identical score (maximum likelihood or parsimony score) may hamper tree search algorithms. Such collections are coined phylogenetic terraces. For sparse supermatrices with a lot of missing data, the number of terraces and the number of trees on the terraces can be very large. If terraces are not taken into account, a lot of computation time might be unnecessarily spent to evaluate many trees that in fact have identical score. To save computation time during the tree search, it is worthwhile to quickly identify such cases. The score of a species tree is the sum of scores for all the so-called induced partition trees. Therefore, if the topological rearrangement applied to a species tree does not change the induced partition trees, the score of these partition trees is unchanged. Here, we provide the conditions under which the three most widely used topological rearrangements (nearest neighbor interchange, subtree pruning and regrafting, and tree bisection and reconnection) change the topologies of induced partition trees. During the tree search, these conditions allow us to quickly identify whether we can save computation time on the evaluation of newly encountered trees. We also introduce the concept of partial terraces and demonstrate that they occur more frequently than the original "full" terrace. Hence, partial terrace is the more important factor of timesaving compared to full terrace. Therefore, taking into account the above conditions and the partial terrace concept will help to speed up the tree search in phylogenomic inference.
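The central observation of the abstract, that a rearrangement leaves a partition's score untouched whenever the induced partition tree is unchanged, can be illustrated with a small brute-force check. The sketch below is not the set of conditions derived in the paper; it simply compares induced partition trees before and after a rearrangement by restricting every bipartition of the species tree to the taxa present in a partition. The nested-tuple tree encoding, the function names, and the toy taxon sets are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of children.
Tree = Union[str, tuple]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def induced_splits(tree: Tree, taxa: Set[str]) -> Set[FrozenSet[FrozenSet[str]]]:
    """Nontrivial bipartitions of the tree restricted to `taxa`.

    Two unrooted trees on the same taxa have the same topology exactly when
    their sets of nontrivial bipartitions agree, so this set stands in for
    the induced partition tree.
    """
    present = frozenset(taxa) & leaves(tree)
    splits: Set[FrozenSet[FrozenSet[str]]] = set()

    def visit(node: Tree) -> None:
        side_a = leaves(node) & present
        side_b = present - side_a
        # Keep only splits with at least two taxa on each side.
        if len(side_a) >= 2 and len(side_b) >= 2:
            splits.add(frozenset([side_a, side_b]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return splits


def unchanged_partitions(old: Tree, new: Tree,
                         partitions: Dict[str, Set[str]]) -> List[str]:
    """Names of partitions whose induced tree is identical in both species trees."""
    return [name for name, taxa in partitions.items()
            if induced_splits(old, taxa) == induced_splits(new, taxa)]


# Toy example: an NNI on the internal edge AB|CDE swaps B and C.
old_tree = (("A", "B"), "C", ("D", "E"))
new_tree = (("A", "C"), "B", ("D", "E"))
partitions = {
    "gene1": {"A", "B", "C", "D", "E"},  # complete taxon coverage
    "gene2": {"A", "B", "D", "E"},       # C is missing
    "gene3": {"A", "B", "C", "D"},       # E is missing
}
print(unchanged_partitions(old_tree, new_tree, partitions))
```

In this toy case only `gene2` keeps its induced tree, so only the scores of `gene1` and `gene3` would need to be recomputed for the new species tree. A tree search would want a cheaper test than this full comparison, which is the role of the conditions the abstract describes.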

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5223/4663649/c0e0882598aa/fig-1.jpg

Similar articles

Consequences of Common Topological Rearrangements for Partition Trees in Phylogenomic Inference.

J Comput Biol. 2015 Dec;22(12):1129-42. doi: 10.1089/cmb.2015.0146. Epub 2015 Oct 8.

Terrace Aware Data Structure for Phylogenomic Inference from Supermatrices.

Syst Biol. 2016 Nov;65(6):997-1008. doi: 10.1093/sysbio/syw037. Epub 2016 Apr 26.

The prevalence of terraced treescapes in analyses of phylogenetic data sets.

BMC Evol Biol. 2018 Apr 4;18(1):46. doi: 10.1186/s12862-018-1162-9.

The prevalence of multifurcations in tree-space and their implications for tree-search.

Mol Biol Evol. 2010 Dec;27(12):2674-7. doi: 10.1093/molbev/msq163. Epub 2010 Jun 28.

Impacts of Terraces on Phylogenetic Inference.

Syst Biol. 2015 Sep;64(5):709-26. doi: 10.1093/sysbio/syv024. Epub 2015 May 20.

Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used.

Mol Biol Evol. 2000 Aug;17(8):1251-8. doi: 10.1093/oxfordjournals.molbev.a026408.

Progressive tree neighborhood applied to the maximum parsimony problem.

IEEE/ACM Trans Comput Biol Bioinform. 2008 Jan-Mar;5(1):136-45. doi: 10.1109/TCBB.2007.1065.

morePhyML: improving the phylogenetic tree space exploration with PhyML 3.

Mol Phylogenet Evol. 2011 Dec;61(3):944-8. doi: 10.1016/j.ympev.2011.08.029. Epub 2011 Sep 8.

Terraces in phylogenetic tree space.

Science. 2011 Jul 22;333(6041):448-50. doi: 10.1126/science.1206357. Epub 2011 Jun 16.

Characterizing the phylogenetic tree-search problem.

Syst Biol. 2012 Mar;61(2):228-39. doi: 10.1093/sysbio/syr097. Epub 2011 Nov 10.

Cited by

Gentrius: Generating Trees Compatible With a Set of Unrooted Subtrees and its Application to Phylogenetic Terraces.

Mol Biol Evol. 2024 Nov 1;41(11). doi: 10.1093/molbev/msae219.

On Defining and Finding Islands of Trees and Mitigating Large Island Bias.

Syst Biol. 2021 Oct 13;70(6):1282-1294. doi: 10.1093/sysbio/syab015.

IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era.

Mol Biol Evol. 2020 May 1;37(5):1530-1534. doi: 10.1093/molbev/msaa015.

Integrative taxonomy resolves taxonomic uncertainty for freshwater mussels being considered for protection under the U.S. Endangered Species Act.

Sci Rep. 2018 Oct 26;8(1):15892. doi: 10.1038/s41598-018-33806-z.

Two C++ libraries for counting trees on a phylogenetic terrace.

Bioinformatics. 2018 Oct 1;34(19):3399-3401. doi: 10.1093/bioinformatics/bty384.

The prevalence of terraced treescapes in analyses of phylogenetic data sets.

BMC Evol Biol. 2018 Apr 4;18(1):46. doi: 10.1186/s12862-018-1162-9.

Terrace Aware Data Structure for Phylogenomic Inference from Supermatrices.

Syst Biol. 2016 Nov;65(6):997-1008. doi: 10.1093/sysbio/syw037. Epub 2016 Apr 26.

W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis.

Nucleic Acids Res. 2016 Jul 8;44(W1):W232-5. doi: 10.1093/nar/gkw256. Epub 2016 Apr 15.

References

Impacts of Terraces on Phylogenetic Inference.

Syst Biol. 2015 Sep;64(5):709-26. doi: 10.1093/sysbio/syv024. Epub 2015 May 20.

IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies.

Mol Biol Evol. 2015 Jan;32(1):268-74. doi: 10.1093/molbev/msu300. Epub 2014 Nov 3.

The bee tree of life: a supermatrix approach to apoid phylogeny and biogeography.

BMC Evol Biol. 2013 Jul 3;13:138. doi: 10.1186/1471-2148-13-138.

Macroevolutionary dynamics and historical biogeography of primate diversification inferred from a species supermatrix.

PLoS One. 2012;7(11):e49521. doi: 10.1371/journal.pone.0049521. Epub 2012 Nov 16.

Updating the evolutionary history of Carnivora (Mammalia): a new species-level supertree complete with divergence time estimates.

BMC Biol. 2012 Feb 27;10:12. doi: 10.1186/1741-7007-10-12.

The taming of an impossible child: a standardized all-in approach to the phylogeny of Hymenoptera using public database sequences.

BMC Biol. 2011 Aug 18;9:55. doi: 10.1186/1741-7007-9-55.

A large-scale phylogeny of Amphibia including over 2800 species, and a revised classification of extant frogs, salamanders, and caecilians.

Mol Phylogenet Evol. 2011 Nov;61(2):543-83. doi: 10.1016/j.ympev.2011.06.012. Epub 2011 Jun 23.

Terraces in phylogenetic tree space.

Science. 2011 Jul 22;333(6041):448-50. doi: 10.1126/science.1206357. Epub 2011 Jun 16.

Phylogenetic supertrees: Assembling the trees of life.

Trends Ecol Evol. 1998 Mar;13(3):105-9. doi: 10.1016/S0169-5347(97)01242-1.

The phylogeny of advanced snakes (Colubroidea), with discovery of a new subfamily and comparison of support methods for likelihood trees.

Mol Phylogenet Evol. 2011 Feb;58(2):329-42. doi: 10.1016/j.ympev.2010.11.006. Epub 2010 Nov 11.
