GPTree Cluster: phylogenetic tree cluster generator in the context of supertree inference.

Author information

Koshkarov Aleksandr, Tahiri Nadia

Affiliations

Department of Computer Science, University of Sherbrooke, 2500, Boulevard de l'Université, Sherbrooke, Québec J1K 2R1, Canada.

Publication information

Bioinform Adv. 2023 Mar 3;3(1):vbad023. doi: 10.1093/bioadv/vbad023. eCollection 2023.

Abstract

SUMMARY

For many years, evolutionary and molecular biologists have been working with phylogenetic supertrees, which are directed acyclic graph structures. In the standard approaches, supertrees are obtained by concatenating a set of phylogenetic trees defined on different but overlapping sets of taxa (i.e. species). More recent approaches propose alternative solutions for supertree inference. Testing new metrics for comparing supertrees, and adapting clustering algorithms to overlapping phylogenetic trees with different numbers of leaves, requires large amounts of data. In this context, designing a new approach and developing a computer program to generate phylogenetic tree clusters with different numbers of overlapping leaves are key elements to advance research on phylogenetic supertrees and evolution. The main objective of the project is to propose a new approach to simulate clusters of phylogenetic trees defined on different, but mutually overlapping, sets of taxa, with biological events. The proposed generator can be used to generate a specified number of clusters of phylogenetic trees in Newick format, with a variable number of leaves and a defined level of overlap between the trees in each cluster.
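To make the idea of a cluster of overlapping trees concrete, here is a minimal Python sketch. It is not the GPTree Cluster algorithm: the function names (`random_newick`, `generate_cluster`, `leaf_set`) and all parameters are hypothetical, branch lengths and biological events are left out, and the "level of overlap" is assumed to mean the minimum fraction of the smaller tree's leaves that any two trees in the cluster share. Under that assumption, every tree contains a common core of taxa plus a random number of extra taxa.

```python
import math
import random
import re


def random_newick(taxa, rng):
    """Return a random rooted binary topology over `taxa` as a Newick string."""
    subtrees = list(taxa)
    # Join two randomly chosen subtrees until a single root remains.
    while len(subtrees) > 1:
        left = subtrees.pop(rng.randrange(len(subtrees)))
        right = subtrees.pop(rng.randrange(len(subtrees)))
        subtrees.append("({},{})".format(left, right))
    return subtrees[0] + ";"


def generate_cluster(pool, n_trees, max_leaves, overlap, rng):
    """Generate `n_trees` Newick trees whose leaf sets share a common core.

    Assumption of this sketch: `overlap` (0 < overlap <= 1) is the minimum
    fraction of the smaller tree's leaves shared by any pair of trees.
    """
    if not 0 < overlap <= 1:
        raise ValueError("overlap must be in (0, 1]")
    if max_leaves > len(pool):
        raise ValueError("max_leaves cannot exceed the size of the taxon pool")
    # A core of ceil(overlap * max_leaves) taxa is present in every tree.
    core_size = math.ceil(overlap * max_leaves)
    core = rng.sample(pool, core_size)
    rest = [taxon for taxon in pool if taxon not in core]
    trees = []
    for _ in range(n_trees):
        # Each tree gets the core plus a variable number of extra taxa.
        n_extra = rng.randint(0, max_leaves - core_size)
        leaves = core + rng.sample(rest, n_extra)
        trees.append(random_newick(leaves, rng))
    return trees


def leaf_set(newick):
    """Extract leaf names from a Newick string without branch lengths."""
    return set(re.findall(r"[^(),;]+", newick))


rng = random.Random(42)  # fixed seed so the run is reproducible
pool = ["t{}".format(i) for i in range(1, 21)]
for tree in generate_cluster(pool, n_trees=4, max_leaves=8, overlap=0.5, rng=rng):
    print(len(leaf_set(tree)), tree)
```

Because every tree contains the whole core, any two trees share at least `overlap * max_leaves` leaves, which bounds the pairwise overlap from below. A real generator would also need branch lengths and a model of the biological events mentioned in the abstract.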

AVAILABILITY AND IMPLEMENTATION

GPTree Cluster, a Python 3.7 script that implements the discussed approach, is freely available at https://github.com/tahiri-lab/GPTree/tree/GPTreeCluster.

Figure image: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/560f/10089678/3718b245b6e9/vbad023f1.jpg
