A metric for phylogenetic trees based on matching.

Affiliation

Laboratory for Computational Biology and Bioinformatics, School of Computer and Communication Sciences, Swiss Federal Institute of Technology-EPFL, INJ 211, Station 14, Lausanne CH-1015, Switzerland.

Publication details

IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):1014-22. doi: 10.1109/TCBB.2011.157.

DOI:10.1109/TCBB.2011.157
PMID:22184263
Abstract

Comparing two or more phylogenetic trees is a fundamental task in computational biology. The simplest outcome of such a comparison is a pairwise measure of similarity, dissimilarity, or distance. A large number of such measures have been proposed, but so far all suffer from problems varying from computational cost to lack of robustness; many can be shown to behave unexpectedly under certain plausible inputs. For instance, the widely used Robinson-Foulds distance is poorly distributed and thus affords little discrimination, while also lacking robustness in the face of very small changes--reattaching a single leaf elsewhere in a tree of any size can instantly maximize the distance. In this paper, we introduce a new pairwise distance measure, based on matching, for phylogenetic trees. We prove that our measure induces a metric on the space of trees, show how to compute it in low polynomial time, verify through statistical testing that it is robust, and finally note that it does not exhibit unexpected behavior under the same inputs that cause problems with other measures. We also illustrate its usefulness in clustering trees, demonstrating significant improvements in the quality of hierarchical clustering as compared to the same collections of trees clustered using the Robinson-Foulds distance.
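The contrast the abstract draws between the Robinson-Foulds distance and a matching-based distance can be illustrated on six-leaf trees. The sketch below is not the authors' implementation. It assumes trees are written as nested tuples of leaf names, that the cost of pairing two bipartitions is the minimum number of leaves that must switch sides to turn one into the other, and that both trees are binary on the same leaf set. The function names (`leaves`, `splits`, `robinson_foulds`, `matching_distance`) are hypothetical, and the matching is found by brute force over permutations, which is only practical for very small trees.

```python
from itertools import permutations


def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the tree, viewed as unrooted.

    Each bipartition is stored as the side that does NOT contain a
    fixed reference leaf, so the two sides of one edge compare equal.
    """
    all_leaves = leaves(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        for child in node:
            side = leaves(child)
            if ref in side:
                side = all_leaves - side
            # Skip trivial splits that separate a single leaf.
            if 2 <= len(side) <= len(all_leaves) - 2:
                found.add(side)
            visit(child)

    visit(tree)
    return found


def robinson_foulds(t1, t2):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))


def matching_distance(t1, t2):
    """Minimum-cost perfect matching between the two split sets.

    Assumed cost: the number of leaves that must change sides to turn
    one bipartition into the other. Brute force, small trees only.
    """
    if leaves(t1) != leaves(t2):
        raise ValueError("trees must share the same leaf set")
    n = len(leaves(t1))
    s1, s2 = list(splits(t1)), list(splits(t2))
    if len(s1) != len(s2):
        raise ValueError("this sketch needs equal numbers of splits")

    def cost(a, b):
        d = len(a ^ b)
        return min(d, n - d)  # n - d compares a with the complement of b

    return min(
        sum(cost(a, s2[j]) for a, j in zip(s1, perm))
        for perm in permutations(range(len(s2)))
    )


# A caterpillar tree, the same tree with leaf "a" moved to the far end,
# and a tree that groups the leaves quite differently.
caterpillar = ((((("a", "b"), "c"), "d"), "e"), "f")
one_leaf_moved = (((("b", "c"), "d"), "e"), ("f", "a"))
regrouped = ((("a", "d"), ("b", "e")), ("c", "f"))

print(robinson_foulds(caterpillar, one_leaf_moved))    # 6
print(robinson_foulds(caterpillar, regrouped))         # 6
print(matching_distance(caterpillar, one_leaf_moved))  # 4
print(matching_distance(caterpillar, regrouped))       # 7
```

In this toy case the Robinson-Foulds distance reaches its maximum for six leaves (6) for both comparisons, while the matching-style distance still ranks the single-leaf move (4) as closer than the regrouped tree (7). For larger trees the brute-force search would have to be replaced by a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment`.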

Similar articles

1
A metric for phylogenetic trees based on matching.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):1014-22. doi: 10.1109/TCBB.2011.157.
2
Metrics for phylogenetic networks I: generalizations of the Robinson-Foulds metric.
IEEE/ACM Trans Comput Biol Bioinform. 2009 Jan-Mar;6(1):46-61. doi: 10.1109/TCBB.2008.70.
3
The Generalized Robinson-Foulds Distance for Phylogenetic Trees.
J Comput Biol. 2021 Dec;28(12):1181-1195. doi: 10.1089/cmb.2021.0342. Epub 2021 Oct 29.
4
The k-Robinson-Foulds Dissimilarity Measures for Comparison of Labeled Trees.
J Comput Biol. 2024 Apr;31(4):328-344. doi: 10.1089/cmb.2023.0312. Epub 2024 Jan 25.
5
Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions.
J Comput Biol. 2017 May;24(5):422-435. doi: 10.1089/cmb.2016.0204. Epub 2017 Feb 8.
6
MASTtreedist: visualization of tree space based on maximum agreement subtree.
J Comput Biol. 2013 Jan;20(1):42-9. doi: 10.1089/cmb.2012.0243.
7
An efficient algorithm for approximating geodesic distances in tree space.
IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1196-207. doi: 10.1109/TCBB.2010.121.
8
A Linear Time Solution to the Labeled Robinson-Foulds Distance Problem.
Syst Biol. 2022 Oct 12;71(6):1391-1403. doi: 10.1093/sysbio/syac028.
9
Efficiently computing the Robinson-Foulds metric.
J Comput Biol. 2007 Jul-Aug;14(6):724-35. doi: 10.1089/cmb.2007.R012.
10
Asymmetric Cluster-Based Measures for Comparative Phylogenetics.
J Comput Biol. 2024 Apr;31(4):312-327. doi: 10.1089/cmb.2023.0338. Epub 2024 Apr 17.

Cited by

1
The path-label reconciliation (PLR) dissimilarity measure for gene trees.
Algorithms Mol Biol. 2025 Aug 19;20(1):16. doi: 10.1186/s13015-025-00284-8.
2
Sparse Neighbor Joining: rapid phylogenetic inference using a sparse distance matrix.
Bioinformatics. 2024 Nov 28;40(12). doi: 10.1093/bioinformatics/btae701.
3
Spectral cluster supertree: fast and statistically robust merging of rooted phylogenetic trees.
Front Mol Biosci. 2024 Oct 30;11:1432495. doi: 10.3389/fmolb.2024.1432495. eCollection 2024.
4
Asymmetric Cluster-Based Measures for Comparative Phylogenetics.
J Comput Biol. 2024 Apr;31(4):312-327. doi: 10.1089/cmb.2023.0338. Epub 2024 Apr 17.
5
Robust expansion of phylogeny for fast-growing genome sequence data.
PLoS Comput Biol. 2024 Feb 8;20(2):e1011871. doi: 10.1371/journal.pcbi.1011871. eCollection 2024 Feb.
6
Optimizing ancestral trait reconstruction of large HIV Subtype C datasets through multiple-trait subsampling.
Virus Evol. 2023 Nov 22;9(2):vead069. doi: 10.1093/ve/vead069. eCollection 2023.
7
Divergent vertebral formulae shape the evolution of axial complexity in mammals.
Nat Ecol Evol. 2023 Mar;7(3):367-381. doi: 10.1038/s41559-023-01982-5. Epub 2023 Mar 6.
8
The Structure of Evolutionary Model Space for Proteins across the Tree of Life.
Biology (Basel). 2023 Feb 10;12(2):282. doi: 10.3390/biology12020282.
9
Phylogenies from unaligned proteomes using sequence environments of amino acid residues.
Sci Rep. 2022 May 6;12(1):7497. doi: 10.1038/s41598-022-11370-x.
10
A Linear Time Solution to the Labeled Robinson-Foulds Distance Problem.
Syst Biol. 2022 Oct 12;71(6):1391-1403. doi: 10.1093/sysbio/syac028.