MIKE: an ultrafast, assembly-, and alignment-free approach for phylogenetic tree construction.

Affiliations

College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, Shanxi 030024, China.

National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China.

Publication information

Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae154.

Abstract

MOTIVATION

Constructing a phylogenetic tree requires calculating the evolutionary distance between samples or species from large-scale resequencing data, a process that is both time-consuming and computationally demanding. Striking the right balance between accuracy and efficiency is a significant challenge.

RESULTS

To address this, we introduce a new algorithm, MIKE (MinHash-based k-mer algorithm). MIKE rapidly calculates the Jaccard coefficient directly from raw sequencing reads and constructs phylogenetic trees from the resulting Jaccard coefficients. Simulation results show that MIKE is faster than existing state-of-the-art methods. We used MIKE to reconstruct a phylogenetic tree incorporating 238 yeast, 303 Zea, 141 Ficus, 67 Oryza, and 43 Saccharum spontaneum samples. MIKE performed accurately across varying evolutionary scales, reproductive modes, and ploidy levels, making it a powerful tool for phylogenetic tree construction.
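To make the idea of a MinHash-based Jaccard estimate on k-mers concrete, here is a minimal Python sketch. It is not MIKE's implementation: the function names, the BLAKE2b hash, the k-mer length, the sketch size, and the conversion to a distance as 1 - J are all assumptions chosen for illustration. It builds a bottom-s sketch (the s smallest k-mer hashes) for each sample and estimates the Jaccard coefficient from the sketches alone.

```python
import hashlib
from itertools import combinations


def kmers(seq, k):
    """Yield every k-mer of a read (uppercased, skipping k-mers that contain N)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:
            yield kmer


def kmer_hash(kmer):
    """Map a k-mer to a 64-bit integer; BLAKE2b is an illustrative choice."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(reads, k=21, size=1000):
    """Bottom-s MinHash sketch: the `size` smallest distinct k-mer hashes."""
    hashes = {kmer_hash(km) for read in reads for km in kmers(read, k)}
    return sorted(hashes)[:size]


def jaccard_estimate(sketch_a, sketch_b, size=1000):
    """Estimate the Jaccard coefficient from two bottom-s sketches.

    The `size` smallest hashes of the merged sketches form a sketch of the
    union; the fraction of them present in both inputs estimates |A∩B|/|A∪B|.
    """
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def distance_matrix(samples, k=21, size=1000):
    """Pairwise distances 1 - J between named samples (hypothetical transform)."""
    sketches = {name: bottom_sketch(reads, k, size) for name, reads in samples.items()}
    dist = {(name, name): 0.0 for name in samples}
    for a, b in combinations(samples, 2):
        d = 1.0 - jaccard_estimate(sketches[a], sketches[b], size)
        dist[(a, b)] = dist[(b, a)] = d
    return dist


if __name__ == "__main__":
    # Toy reads; real input would be the raw reads of each sequenced sample.
    samples = {
        "sample_a": ["ACGTACGTAC"],
        "sample_b": ["ACGTTTTT"],
    }
    for pair, d in sorted(distance_matrix(samples, k=4).items()):
        print(pair, round(d, 4))
```

The resulting distance matrix could then be passed to any distance-based tree builder, such as neighbor joining. When a sample has fewer distinct k-mers than the sketch size, the sketch holds the whole k-mer set and the estimate equals the exact Jaccard coefficient; with larger inputs the sketch size trades accuracy for speed and memory.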

AVAILABILITY AND IMPLEMENTATION

MIKE is publicly available on GitHub at https://github.com/Argonum-Clever2/mike.git.


