Enumerating all maximal frequent subtrees in collections of phylogenetic trees.

Authors

Deepak Akshay, Fernández-Baca David

Affiliations

Department of Electrical and Computer Engineering, Iowa State University, Ames, Iowa, USA.

Department of Computer Science, Iowa State University, Ames, Iowa, USA.

Publication information

Algorithms Mol Biol. 2014 Jun 18;9:16. doi: 10.1186/1748-7188-9-16. eCollection 2014.

DOI:10.1186/1748-7188-9-16

PMID:25061474

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4085724/

Abstract

BACKGROUND

A common problem in phylogenetic analysis is to identify frequent patterns in a collection of phylogenetic trees. The goal is, roughly, to find a subset of the species (taxa) on which all or some significant subset of the trees agree. One popular method to do so is through maximum agreement subtrees (MASTs). MASTs are also used, among other things, as a metric for comparing phylogenetic trees, for computing congruence indices, and for identifying horizontal gene transfer events.

RESULTS

We give algorithms and experimental results for two approaches to identify common patterns in a collection of phylogenetic trees, one based on agreement subtrees, called maximal agreement subtrees, the other on frequent subtrees, called maximal frequent subtrees. These approaches can return subtrees on larger sets of taxa than MASTs, and can reveal new common phylogenetic relationships not present in either MASTs or the majority rule tree (a popular consensus method). Our current implementation is available on the web at https://code.google.com/p/mfst-miner/.

CONCLUSIONS

Our computational results confirm that maximal agreement subtrees and all maximal frequent subtrees can reveal a more complete phylogenetic picture of the common patterns in collections of phylogenetic trees than maximum agreement subtrees; they are also often more resolved than the majority rule tree. Further, our experiments show that enumerating maximal frequent subtrees is considerably more practical than enumerating ordinary (not necessarily maximal) frequent subtrees.
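To make the notion of trees "agreeing" on a subset of taxa concrete, the following is a minimal sketch and not the authors' algorithm. It assumes rooted trees written as nested tuples with string leaf labels, and it assumes every tree contains all of the queried taxa. The function names `restrict`, `canonical`, and `support` are hypothetical. The sketch only checks one given taxon subset; it does not enumerate maximal subtrees.

```python
from collections import Counter

def restrict(tree, taxa):
    """Restrict a rooted tree (nested tuples, string leaves) to `taxa`,
    suppressing any node left with a single child. Returns None if no leaf remains."""
    if isinstance(tree, str):
        return tree if tree in taxa else None
    kept = [r for r in (restrict(child, taxa) for child in tree) if r is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)

def canonical(tree):
    """Child-order-independent string form, so equal topologies compare equal."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(child) for child in tree)) + ")"

def support(trees, taxa):
    """Return the most common restricted topology on `taxa` and the
    fraction of input trees that display it (assumes all taxa occur in every tree)."""
    shapes = Counter(canonical(restrict(tree, taxa)) for tree in trees)
    shape, count = shapes.most_common(1)[0]
    return shape, count / len(trees)

# Illustrative data (made up for this sketch): three rooted trees on taxa a, b, c, d.
trees = [
    ((("a", "b"), "c"), "d"),
    ((("a", "b"), "d"), "c"),
    ((("a", "c"), "b"), "d"),
]

print(support(trees, {"a", "b", "c"}))       # topology shared by two of the three trees
print(support(trees, {"a", "b", "c", "d"}))  # no two trees agree on all four taxa
```

Under a frequency threshold of one half, the subtree on {a, b, c} would count as frequent in this toy collection, while no subtree on all four taxa would. An agreement subtree in the strict sense corresponds to a support of 1.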

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ccc/4085724/1defd8ba9819/1748-7188-9-16-1.jpg
