KPop: accurate and scalable comparative analysis of microbial genomes by sequence embeddings.

Authors

Didelot Xavier, Ribeca Paolo

Affiliations

School of Life Sciences and Department of Statistics, University of Warwick, Coventry, UK.

NIHR Health Protection Research Unit in Genomics and Enabling Data, University of Warwick, Coventry, UK.

Publication information

Genome Biol. 2025 Jun 18;26(1):170. doi: 10.1186/s13059-025-03585-8.

Abstract

Here we introduce KPop, a novel versatile method based on full k-mer spectra and dataset-specific transformations, through which thousands of assembled or unassembled microbial genomes can be quickly compared. Unlike MinHash-based methods that produce distances and have lower resolution, KPop is able to accurately map sequences onto a low-dimensional space. Extensive validation on simulated and real-life viral and bacterial datasets shows that KPop can correctly separate sequences at both species and sub-species levels even when the overall genomic diversity is low. KPop also rapidly identifies related sequences and systematically outperforms MinHash-based methods.
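As a rough illustration of what a "full k-mer spectrum" is, the sketch below counts every overlapping k-mer in a sequence and lays the counts out as a fixed-order frequency vector. It is a minimal example under stated assumptions, not KPop's implementation: the function names `kmer_spectrum` and `spectrum_vector`, the choice of k = 2, and the toy sequences are all hypothetical, and the dataset-specific transformation that maps spectra onto a low-dimensional space is not shown because the abstract does not describe it.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(sequence, k):
    """Count every overlapping k-mer, skipping any that contain non-ACGT symbols."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def spectrum_vector(counts, k):
    """Return the counts as a vector of length 4**k in lexicographic k-mer order,
    normalised so that the entries sum to 1 (all zeros if nothing was counted)."""
    total = sum(counts.values())
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


# Hypothetical toy sequences; real inputs would be assemblies or read sets.
toy_genomes = {"a": "ACGTACGTAC", "b": "ACGTTTTTAC"}
vectors = {
    name: spectrum_vector(kmer_spectrum(seq, 2), 2)
    for name, seq in toy_genomes.items()
}
```

Each genome becomes a point in a 4^k-dimensional space, which is why some further transformation is needed before the data can be viewed or compared in a low-dimensional space.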


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dad4/12175428/75d4fa59ee86/13059_2025_3585_Fig1_HTML.jpg
