Comparison of phylogenetic trees defined on different but mutually overlapping sets of taxa: A review.

Author information

Li Wanlin, Koshkarov Aleksandr, Tahiri Nadia

Affiliation

Department of Computer Science, University of Sherbrooke, Sherbrooke, Quebec, Canada.

Publication information

Ecol Evol. 2024 Aug 8;14(8):e70054. doi: 10.1002/ece3.70054. eCollection 2024 Aug.

Abstract

Phylogenetic trees represent the evolutionary relationships and ancestry of various species or groups of organisms. Comparing these trees by measuring the distance between them is essential for applications such as tree clustering and the Tree of Life project. Many distance metrics for phylogenetic trees focus on trees defined on the same set of taxa. However, some problems require calculating distances between trees with different but overlapping sets of taxa. This study reviews state-of-the-art distance measures for such trees, covering six major approaches, including the constraint-based Robinson-Foulds (RF) distance RF(-), the completion-based RF(+), the generalized RF (GRF), the dissimilarity measure, the vectorial tree distance, and the geodesic distance in the extended Billera-Holmes-Vogtmann tree space. Among these, three RF-based methods, RF(-), RF(+), and GRF, were examined in detail on generated clusters of phylogenetic trees defined on different but mutually overlapping sets of taxa. Additionally, we reviewed nine related techniques, including leaf imputation methods, the tree edit distance, and visual comparison. A comparison of the related distance measures, highlighting their principal advantages and shortcomings, is provided. This review offers valuable insights into their applicability and performance, guiding the appropriate use of these metrics based on tree type (rooted or unrooted) and information type (topological or branch lengths).
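
To make the constraint-based idea concrete, the following is a minimal sketch of an RF(-)-style computation, not the authors' implementation. It assumes rooted trees written as nested Python tuples with string leaf names (a representation chosen purely for illustration), prunes both trees to their shared taxa, and then counts the clusters found in only one of the pruned trees. The helper names `leaves`, `restrict`, `clusters`, and `rf_minus` are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def restrict(tree, keep):
    """Prune leaves not in `keep`; suppress nodes left with a single child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def clusters(tree):
    """Leaf sets of internal nodes, excluding the root (non-trivial clusters)."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    root = walk(tree)
    found.discard(root)
    return found


def rf_minus(t1, t2):
    """Rooted RF distance after restricting both trees to their common taxa."""
    common = leaves(t1) & leaves(t2)
    r1, r2 = restrict(t1, common), restrict(t2, common)
    if r1 is None or r2 is None:
        return 0  # assumption: no shared taxa means nothing to compare
    return len(clusters(r1) ^ clusters(r2))


# Two trees sharing taxa A, B, C, D; E and F each occur in only one tree.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "F"))
print(rf_minus(t1, t2))  # 2: clusters {A,B} and {A,C} are each unique to one tree
```

A completion-based measure such as RF(+) takes the opposite route, adding the missing taxa to each tree instead of discarding the unshared ones, and is not covered by this sketch.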

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3cce/11307105/9b50ea902a52/ECE3-14-e70054-g002.jpg