从四重奏构建系统发育树：真兽类超目关系的阐明

Ben-Dor A, Chor B, Graur D, Ophir R, Pelleg D

Department of Computer Science, Technion, Haifa, Israel.

J Comput Biol. 1998 Fall;5(3):377-90. doi: 10.1089/cmb.1998.5.377.

In this work we present two new approaches for constructing phylogenetic trees. The input is a list of weighted quartets over n taxa. Each quartet is a subtree on four taxa, and its weight represents a confidence level for the specific topology. The goal is to construct a binary tree with n leaves such that the total weight of the satisfied quartets is maximized (an NP hard problem). The first approach we present is based on geometric ideas. Using semidefinite programming, we embed the n points on the n-dimensional unit sphere, while maximizing an objective function. This function depends on Euclidean distances between the four points and reflects the quartet topology. Given the embedding, we construct a binary tree by performing geometric clustering. This process is similar to the traditional neighbor joining, with the difference that the update phase retains geometric meaning: When two neighbors are joined together, their common ancestor is taken to be the center of mass of the original points. The geometric algorithm runs in poly(n) time, but there are no guarantees on the quality of its output. In contrast, our second algorithm is based on dynamic programming, and it is guaranteed to find the optimal tree (with respect to the given quartets). Its running time is a modest exponential, so it can be implemented for modest values of n. We have implemented both algorithms and ran them on real data for n = 15 taxa (14 mammalian orders and an outgroup taxon). The two resulting trees improve previously published trees and seem to be of biological relevance. On this dataset, the geometric algorithm produced a tree whose score is 98.2% of the optimal value on this input set (72.1% vs. 73.4%). This gives rise to the hope that the geometric approach will prove viable even for larger cases where the exponential, dynamic programming approach is no longer feasible.

在这项工作中，我们提出了两种构建系统发育树的新方法。输入是n个分类单元上的加权四重奏列表。每个四重奏是四个分类单元上的子树，其权重代表特定拓扑结构的置信水平。目标是构建一棵有n个叶子的二叉树，使得满足的四重奏的总权重最大化（这是一个NP难问题）。我们提出的第一种方法基于几何思想。使用半定规划，我们将n个点嵌入到n维单位球面上，同时最大化一个目标函数。这个函数取决于四个点之间的欧几里得距离，并反映四重奏拓扑结构。给定嵌入后，我们通过执行几何聚类来构建一棵二叉树。这个过程类似于传统的邻接法，不同之处在于更新阶段保留几何意义：当两个邻居连接在一起时，它们的共同祖先被视为原始点的质心。几何算法的运行时间为多项式时间（poly(n)），但其输出质量没有保证。相比之下，我们的第二种算法基于动态规划，并且保证能找到最优树（相对于给定的四重奏）。它的运行时间是适度指数级的，所以可以针对适度的n值进行实现。我们已经实现了这两种算法，并在n = 15个分类单元（14个哺乳动物目和一个外群分类单元）的真实数据上运行了它们。得到的两棵树改进了之前发表的树，并且似乎具有生物学相关性。在这个数据集上，几何算法生成的一棵树的得分是该输入集上最优值的98.2%（72.1%对73.4%）。这让人希望几何方法即使在指数级的动态规划方法不再可行的更大情况下也能证明是可行的。

相似文献

Constructing phylogenies from quartets: elucidation of eutherian superordinal relationships.

J Comput Biol. 1998 Fall;5(3):377-90. doi: 10.1089/cmb.1998.5.377.

Quartets MaxCut: a divide and conquer quartets algorithm.

IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec;7(4):704-18. doi: 10.1109/TCBB.2008.133.

Using max cut to enhance rooted trees consistency.

IEEE/ACM Trans Comput Biol Bioinform. 2006 Oct-Dec;3(4):323-33. doi: 10.1109/TCBB.2006.58.

Accuracy guarantees for phylogeny reconstruction algorithms based on balanced minimum evolution.

IEEE/ACM Trans Comput Biol Bioinform. 2013 May-Jun;10(3):576-83. doi: 10.1109/TCBB.2013.39.

Clearcut: a fast implementation of relaxed neighbor joining.

Bioinformatics. 2006 Nov 15;22(22):2823-4. doi: 10.1093/bioinformatics/btl478. Epub 2006 Sep 18.

Weighted quartets phylogenetics.

Syst Biol. 2015 Mar;64(2):233-42. doi: 10.1093/sysbio/syu087. Epub 2014 Nov 19.

Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle.

J Comput Biol. 2002;9(5):687-705. doi: 10.1089/106652702761034136.

A polynomial-time algorithm computing lower and upper bounds of the rooted subtree prune and regraft distance.

J Comput Biol. 2011 May;18(5):743-57. doi: 10.1089/cmb.2010.0045. Epub 2010 Dec 18.

Reconstructing a phylogenetic level-1 network from quartets.

Bull Math Biol. 2014 Oct;76(10):2517-41. doi: 10.1007/s11538-014-0022-z. Epub 2014 Sep 19.

Integer linear programming as a tool for constructing trees from quartet data.

Comput Biol Chem. 2005 Jun;29(3):196-203. doi: 10.1016/j.compbiolchem.2005.04.001.

引用本文的文献

Quartet-net: a quartet-based method to reconstruct phylogenetic networks.

Mol Biol Evol. 2013 May;30(5):1206-17. doi: 10.1093/molbev/mst040. Epub 2013 Mar 14.

An experimental study of Quartets MaxCut and other supertree methods.

Algorithms Mol Biol. 2011 Apr 19;6:7. doi: 10.1186/1748-7188-6-7.

Encoding phylogenetic trees in terms of weighted quartets.

J Math Biol. 2008 Apr;56(4):465-77. doi: 10.1007/s00285-007-0125-3. Epub 2007 Sep 21.

Suppr 超能文献

核心技术专利：CN118964589B侵权必究

相似文献

Constructing phylogenies from quartets: elucidation of eutherian superordinal relationships.

J Comput Biol. 1998 Fall;5(3):377-90. doi: 10.1089/cmb.1998.5.377.

Quartets MaxCut: a divide and conquer quartets algorithm.

IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec;7(4):704-18. doi: 10.1109/TCBB.2008.133.

Using max cut to enhance rooted trees consistency.

IEEE/ACM Trans Comput Biol Bioinform. 2006 Oct-Dec;3(4):323-33. doi: 10.1109/TCBB.2006.58.

Accuracy guarantees for phylogeny reconstruction algorithms based on balanced minimum evolution.

IEEE/ACM Trans Comput Biol Bioinform. 2013 May-Jun;10(3):576-83. doi: 10.1109/TCBB.2013.39.

Clearcut: a fast implementation of relaxed neighbor joining.

Bioinformatics. 2006 Nov 15;22(22):2823-4. doi: 10.1093/bioinformatics/btl478. Epub 2006 Sep 18.

Weighted quartets phylogenetics.

Syst Biol. 2015 Mar;64(2):233-42. doi: 10.1093/sysbio/syu087. Epub 2014 Nov 19.

Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle.

J Comput Biol. 2002;9(5):687-705. doi: 10.1089/106652702761034136.

A polynomial-time algorithm computing lower and upper bounds of the rooted subtree prune and regraft distance.

J Comput Biol. 2011 May;18(5):743-57. doi: 10.1089/cmb.2010.0045. Epub 2010 Dec 18.

Reconstructing a phylogenetic level-1 network from quartets.

Bull Math Biol. 2014 Oct;76(10):2517-41. doi: 10.1007/s11538-014-0022-z. Epub 2014 Sep 19.

Integer linear programming as a tool for constructing trees from quartet data.

Comput Biol Chem. 2005 Jun;29(3):196-203. doi: 10.1016/j.compbiolchem.2005.04.001.

引用本文的文献

Quartet-net: a quartet-based method to reconstruct phylogenetic networks.

Mol Biol Evol. 2013 May;30(5):1206-17. doi: 10.1093/molbev/mst040. Epub 2013 Mar 14.

An experimental study of Quartets MaxCut and other supertree methods.

Algorithms Mol Biol. 2011 Apr 19;6:7. doi: 10.1186/1748-7188-6-7.

Encoding phylogenetic trees in terms of weighted quartets.

J Math Biol. 2008 Apr;56(4):465-77. doi: 10.1007/s00285-007-0125-3. Epub 2007 Sep 21.

Constructing phylogenies from quartets: elucidation of eutherian superordinal relationships.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

文献AI研究员

用中文搜PubMed

文档翻译

Suppr 超能文献

相似文献

引用本文的文献