College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Jinzhong 030600, China.
Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518055, China.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae566.
Sketching technologies have recently emerged as a promising solution for real-time, large-scale phylogenetic analysis. However, existing sketching-based phylogenetic tools have drawbacks, including platform restrictions, limited tree visualization, and inherent bias in distance estimation. Together, these limitations reduce the convenience and efficiency of the analysis. In this study, we introduce Kssdtree, an interactive Python package designed to address these challenges. On comprehensive benchmarking datasets, Kssdtree outperforms other sketching-based tools in both accuracy and time efficiency. Notably, Kssdtree supports intra-species phylogenomic analysis and GTDB-based phylogenetic placement analysis, which broadens the scope and depth of phylogenetic investigations. Extensive evaluations and comparisons show Kssdtree to be an efficient and versatile method for real-time, large-scale phylogenetic analysis.
The Kssdtree Python package is freely accessible at https://pypi.org/project/kssdtree and the source code is available at https://github.com/yhlink/kssdtree. The documentation and usage examples for the software are available at https://kssdtree.readthedocs.io/en/latest. The video tutorial is available at https://youtu.be/_6hg59Yn-Ws.
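To make the idea of sketching-based distance estimation concrete, the following is a minimal, generic sketch in Python. It is not Kssdtree's algorithm or API: the function names (`kmers`, `sketch`, `jaccard`, `mash_distance`), the hash-modulo sampling rule, and the parameter defaults are all illustrative assumptions. It only shows the general pattern of reducing each genome to a small sample of k-mers and estimating an evolutionary distance from the overlap of two samples, here using the Mash-style transform of the Jaccard index.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all k-mers in seq (no reverse-complement handling)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stable_hash(kmer):
    """Hash a k-mer to a 64-bit integer, reproducibly across runs."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=8, sample_rate=4):
    """Keep roughly 1/sample_rate of the k-mers, chosen by hash value.

    Hypothetical sampling rule for illustration only; because the rule
    depends only on the k-mer itself, the same k-mer is kept or dropped
    consistently in every genome, so sketches stay comparable.
    """
    return {h for h in map(stable_hash, kmers(seq, k)) if h % sample_rate == 0}


def jaccard(a, b):
    """Jaccard index of two sketches; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mash_distance(j, k):
    """Mash-style distance from Jaccard index j for k-mer length k."""
    if j <= 0:
        return 1.0  # no shared k-mers: cap at the maximum distance
    d = -math.log(2 * j / (1 + j)) / k
    return min(1.0, max(0.0, d))


if __name__ == "__main__":
    s1 = sketch("ACGTACGTTGCAACGTAGCTAGCTAGGATCCA", k=4, sample_rate=2)
    s2 = sketch("ACGTACGTTGCAACGTAGCTAGCTAGGATCGA", k=4, sample_rate=2)
    print(mash_distance(jaccard(s1, s2), k=4))
```

A pairwise matrix of such distances can then be passed to a standard distance-based tree builder such as neighbor joining. Real tools additionally handle reverse complements, large inputs, and bias correction, all of which this sketch omits.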