Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf.

Affiliation

Department of Mathematics and Computer Science, University of the Balearic Islands, E-07122 Palma de Mallorca, Spain.

Publication information

BMC Bioinformatics. 2013 Jan 16;14:3. doi: 10.1186/1471-2105-14-3.

DOI:10.1186/1471-2105-14-3

PMID:23323711

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3716993/

Abstract

BACKGROUND

Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair of phylogenetic trees by first encoding them by means of their half-matrices of cophenetic values, and then comparing these matrices. This idea has been used several times since then to define dissimilarity measures between phylogenetic trees but, to our knowledge, no proper metric on weighted phylogenetic trees with nested taxa based on this idea has been formally defined and studied yet. Actually, the cophenetic values of pairs of different taxa alone are not enough to single out phylogenetic trees with weighted arcs or nested taxa.

RESULTS

For every (rooted) phylogenetic tree $T$, let its cophenetic vector $\varphi(T)$ consist of the cophenetic values of all pairs of taxa in $T$ together with the depths of all taxa in $T$. It turns out that these cophenetic vectors single out weighted phylogenetic trees with nested taxa. We then define a family of cophenetic metrics $d_{\varphi,p}$ by comparing these cophenetic vectors by means of $L^p$ norms, and we study, either analytically or numerically, some of their basic properties: neighbors, diameter, distribution, and their rank correlation with each other and with other metrics.
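As a rough illustration of the construction described above, the following Python sketch builds a cophenetic vector and compares two small trees under an $L^p$ norm. It assumes the usual reading of the cophenetic value of a pair of taxa as the depth of their lowest common ancestor, with depth measured as the sum of arc weights from the root. The tree encoding (a dict mapping each node to its parent and arc weight) and the function names `depth`, `lca`, `cophenetic_vector` and `cophenetic_distance` are hypothetical choices made for this example, not the authors' implementation, and the naive ancestor walks are not the quadratic-time algorithm mentioned in the conclusions.

```python
from itertools import combinations_with_replacement

# Assumed encoding (illustrative only): node -> (parent, arc_weight),
# with the root mapped to None. Taxa may label internal nodes (nested taxa).

def depth(tree, v):
    """Sum of arc weights on the path from the root down to v."""
    d = 0.0
    while tree[v] is not None:
        parent, weight = tree[v]
        d += weight
        v = parent
    return d

def lca(tree, u, v):
    """Lowest common ancestor of u and v (a node counts as its own ancestor)."""
    ancestors = set()
    while u is not None:
        ancestors.add(u)
        u = tree[u][0] if tree[u] is not None else None
    while v not in ancestors:
        v = tree[v][0]
    return v

def cophenetic_vector(tree, taxa):
    """Depths of LCAs for all pairs i <= j of taxa, in lexicographic order.
    The entries with i == j are simply the depths of the taxa."""
    return [depth(tree, lca(tree, i, j))
            for i, j in combinations_with_replacement(sorted(taxa), 2)]

def cophenetic_distance(t1, t2, taxa, p=1):
    """L^p norm of the difference of the two cophenetic vectors."""
    v1 = cophenetic_vector(t1, taxa)
    v2 = cophenetic_vector(t2, taxa)
    return sum(abs(x - y) ** p for x, y in zip(v1, v2)) ** (1.0 / p)

# Two unweighted trees on taxa a, b, c: ((a,b),c) and (a,(b,c)).
taxa = ["a", "b", "c"]
t1 = {"r": None, "x": ("r", 1.0), "a": ("x", 1.0), "b": ("x", 1.0), "c": ("r", 1.0)}
t2 = {"r": None, "y": ("r", 1.0), "a": ("r", 1.0), "b": ("y", 1.0), "c": ("y", 1.0)}

print(cophenetic_vector(t1, taxa))           # [2.0, 1.0, 0.0, 2.0, 0.0, 1.0]
print(cophenetic_distance(t1, t2, taxa, 1))  # 4.0
print(cophenetic_distance(t1, t2, taxa, 2))  # 2.0
```

Including the diagonal entries (the depths of the taxa themselves) is what lets the vector distinguish trees that differ only in pendant arc weights or in which internal nodes carry taxa.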

CONCLUSIONS

The cophenetic metrics can be safely used on weighted phylogenetic trees with nested taxa and no restriction on degrees, and they can be computed in $O(n^2)$ time, where $n$ stands for the number of taxa. The metrics $d_{\varphi,1}$ and $d_{\varphi,2}$ have positively skewed distributions, and they show a low rank correlation with the Robinson-Foulds metric and the nodal metrics, and a very high correlation with each other and with the splitted nodal metrics. The diameter of $d_{\varphi,p}$, for $p \geq 1$, is in $O(n^{(p+2)/p})$, and thus for low $p$ they are more discriminative, having a wider range of values.
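To make the diameter bound concrete, substituting the two values of $p$ studied in the abstract into the stated exponent $(p+2)/p$ gives the following. This is only a worked substitution into the bound quoted above, not an additional result.

```latex
\[
  p = 1:\quad \operatorname{diam}(d_{\varphi,1}) \in O\!\left(n^{(1+2)/1}\right) = O(n^{3}),
  \qquad
  p = 2:\quad \operatorname{diam}(d_{\varphi,2}) \in O\!\left(n^{(2+2)/2}\right) = O(n^{2}).
\]
```

Since the exponent $(p+2)/p = 1 + 2/p$ decreases as $p$ grows, the bound on the range of values is widest at $p = 1$, which matches the remark that low values of $p$ are more discriminative.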
