Rapid species-level metagenome profiling and containment estimation with sylph.

Author information

Shaw Jim, Yu Yun William

Affiliations

Department of Mathematics, University of Toronto, Toronto, Ontario, Canada.

Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA.

Publication information

Nat Biotechnol. 2024 Oct 8. doi: 10.1038/s41587-024-02412-y.

Abstract

Profiling metagenomes against databases allows for the detection and quantification of microorganisms, even at low abundances where assembly is not possible. We introduce sylph, a species-level metagenome profiler that estimates genome-to-metagenome containment average nucleotide identity (ANI) through zero-inflated Poisson k-mer statistics, enabling ANI-based taxa detection. On the Critical Assessment of Metagenome Interpretation II (CAMI2) Marine dataset, sylph was the most accurate profiling method of seven tested. For multisample profiling, sylph took >10-fold less central processing unit time compared to Kraken2 and used 30-fold less memory. Sylph's ANI estimates provided an orthogonal signal to abundance, allowing for an ANI-based metagenome-wide association study for Parkinson disease (PD) against 289,232 genomes while confirming known butyrate-PD associations at the strain level. Sylph took <1 min and 16 GB of random-access memory to profile metagenomes against 85,205 prokaryotic and 2,917,516 viral genomes, detecting 30-fold more viral sequences in the human gut compared to RefSeq. Sylph offers precise, efficient profiling with accurate containment ANI estimation even for low-coverage genomes.
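
The coverage-aware containment ANI idea mentioned in the abstract can be illustrated with a small sketch. This is not sylph's code: the function names, the default k = 31, and the simplified Poisson model (no zero-inflation fitting, no k-mer subsampling) are assumptions made for illustration. The sketch relies on two standard facts: containment C of a genome's k-mers converts to an identity estimate as C^(1/k), and for Poisson-distributed k-mer multiplicities with mean λ, the ratio P(2)/P(1) equals λ/2 while the probability of observing a k-mer at least once is 1 - e^(-λ).

```python
import math


def naive_containment_ani(shared_kmers: int, genome_kmers: int, k: int = 31) -> float:
    """ANI estimate from raw containment: (shared / total) ** (1 / k)."""
    containment = shared_kmers / genome_kmers
    return containment ** (1.0 / k)


def estimate_lambda(multiplicity_counts: dict) -> float:
    """Estimate mean k-mer coverage from counts of k-mers seen once and twice.

    For a Poisson variable, P(X=2) / P(X=1) = lambda / 2, so
    lambda is approximately 2 * N2 / N1 (hypothetical simplified estimator).
    """
    n1 = multiplicity_counts.get(1, 0)
    n2 = multiplicity_counts.get(2, 0)
    if n1 == 0:
        raise ValueError("need at least one k-mer with multiplicity 1")
    return 2.0 * n2 / n1


def coverage_adjusted_ani(shared_kmers: int, genome_kmers: int,
                          lam: float, k: int = 31) -> float:
    """Correct containment for k-mers missed because of low coverage.

    Under a Poisson model, a k-mer truly present in the sample is
    observed with probability 1 - exp(-lam); dividing by that
    probability undoes the expected under-sampling.
    """
    detect_prob = 1.0 - math.exp(-lam)
    adjusted = min(1.0, (shared_kmers / genome_kmers) / detect_prob)
    return adjusted ** (1.0 / k)


if __name__ == "__main__":
    # Toy numbers: 393 of 1000 genome k-mers observed at coverage ~0.5.
    lam = estimate_lambda({1: 300, 2: 75})
    print("lambda:", lam)
    print("naive ANI:", naive_containment_ani(393, 1000))
    print("adjusted ANI:", coverage_adjusted_ani(393, 1000, lam))
```

In this toy case the naive estimate falls well below the adjusted one, because most missing k-mers are explained by low coverage rather than by sequence divergence.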

