
Computational Tools for Evaluating Phylogenetic and Hierarchical Clustering Trees.

Authors

John Chakerian, Susan Holmes

Affiliations

Palantir Technologies.

Stanford University, Stanford, CA 94305.

Publication information

J Comput Graph Stat. 2012;21(3):581-599. doi: 10.1080/10618600.2012.640901. Epub 2012 Aug 16.

DOI:10.1080/10618600.2012.640901
PMID:32982128
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7518125/
Abstract

Inferential summaries of tree estimates are useful in evolutionary biology, where phylogenetic trees have been built from DNA data since the 1960s. In bioinformatics, psychometrics, and data mining, hierarchical clustering techniques output the same mathematical objects, and practitioners have similar questions about the stability and "generalizability" of these summaries. This article describes an implementation of the geometric distance between trees developed by Billera, Holmes, and Vogtmann (2001), which applies equally to phylogenetic trees and hierarchical clustering trees, and shows some of its applications in evaluating tree estimates. In particular, since Billera et al. (2001) have shown that the space of trees is nonpositively curved (a CAT(0) space), a collection of trees can naturally be represented as a tree. We compare this representation to the Euclidean approximations of treespace obtained from both classical multidimensional scaling and kernel multidimensional scaling of the matrix of distances between trees. We also apply the distances between trees to hierarchical clustering trees constructed from microarrays. Our method gives a new way of evaluating the influence of both certain columns (positions, variables, or genes) and certain rows (species, observations, or arrays) on the construction of such trees. It can also provide a way of detecting heterogeneous mixtures in the input data. Supplementary materials for this article are available online.
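The analyses described in the abstract all start from a matrix of pairwise distances between trees. The sketch below illustrates only that first step, and it is not the authors' implementation: it uses the rooted Robinson-Foulds (clade-difference) distance as a much simpler stand-in for the Billera-Holmes-Vogtmann geodesic distance, which also accounts for edge lengths. The nested-tuple tree encoding, the function names, and the four example trees are all assumptions made for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the frozenset of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Collect the leaf set of every internal node strictly below the root."""
    found = set()

    def visit(node, is_root):
        if isinstance(node, str):
            return  # a single leaf defines no clade of interest
        if not is_root:
            found.add(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def rf_distance(t1, t2):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree.

    Stand-in only: this ignores edge lengths, unlike the BHV geodesic distance.
    """
    return len(clades(t1) ^ clades(t2))


def distance_matrix(trees):
    """Symmetric matrix of pairwise tree distances with a zero diagonal."""
    n = len(trees)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = rf_distance(trees[i], trees[j])
    return d


# Hypothetical trees on five taxa, e.g. from resampled columns of a data set.
trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    (("A", "B"), (("C", "D"), "E")),
]
D = distance_matrix(trees)
for row in D:
    print(row)
```

A matrix like `D` is the kind of input that multidimensional scaling, or a tree built on the collection of trees, would take. Swapping in a geodesic distance would change only `rf_distance`.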

