Lightweight taxonomic profiling of long-read metagenomic datasets with Lemur and Magnet.

Author information

Sapoval Nicolae, Liu Yunxi, Curry Kristen D, Kille Bryce, Huang Wenyu, Kokroko Natalie, Nute Michael G, Tyshaieva Alona, Dilthey Alexander, Molloy Erin K, Treangen Todd J

Affiliations

Department of Computer Science, Rice University, Houston, TX 77005, USA.

Department of Computer Science, University of Maryland, College Park, MD 20742, USA.

Publication information

bioRxiv. 2024 Aug 25:2024.06.01.596961. doi: 10.1101/2024.06.01.596961.

Abstract

The advent of long-read sequencing of microbiomes necessitates the development of new taxonomic profilers tailored to long-read shotgun metagenomic datasets. Here, we introduce Lemur and Magnet, a pair of tools optimized for lightweight and accurate taxonomic profiling for long-read shotgun metagenomic datasets. Lemur is a marker-gene-based method that leverages an EM algorithm to reduce false positive calls while preserving true positives; Magnet is a whole-genome read-mapping-based method that provides detailed presence and absence calls for bacterial genomes. We demonstrate that Lemur and Magnet can run in minutes to hours on a laptop with 32 GB of RAM, even for large inputs, a crucial feature given the portability of long-read sequencing machines. Furthermore, the marker gene database used by Lemur is only 4 GB and contains information from over 300,000 RefSeq genomes. Lemur and Magnet are open-source and available at https://github.com/treangenlab/lemur and https://github.com/treangenlab/magnet.
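
The abstract states that Lemur uses an EM algorithm to reduce false positive calls while preserving true positives. The following is a minimal, generic sketch of how expectation-maximization can estimate taxon proportions from ambiguously assigned reads; it is not Lemur's implementation. The function name `em_abundances`, the toy likelihood matrix, and the parameters `n_iter` and `tol` are all assumptions made for illustration.

```python
def em_abundances(likelihoods, n_iter=200, tol=1e-9):
    """Estimate taxon proportions with a plain EM loop.

    likelihoods[r][t] is a hypothetical P(read r | taxon t) score.
    This is an illustrative sketch, not the method used by Lemur.
    """
    n_taxa = len(likelihoods[0])
    # Start from a uniform guess over all candidate taxa.
    abundances = [1.0 / n_taxa] * n_taxa

    for _ in range(n_iter):
        # E-step: split each read across taxa in proportion to
        # (current abundance) x (read likelihood under that taxon).
        counts = [0.0] * n_taxa
        for row in likelihoods:
            weighted = [a * p for a, p in zip(abundances, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read is incompatible with every taxon; skip it
            for t, w in enumerate(weighted):
                counts[t] += w / total

        assigned = sum(counts)
        if assigned == 0.0:
            break  # no read could be assigned; keep the current estimate

        # M-step: new abundances are the normalized fractional read counts.
        updated = [c / assigned for c in counts]
        delta = max(abs(u - a) for u, a in zip(updated, abundances))
        abundances = updated
        if delta < tol:
            break

    return abundances


# Toy input (assumed): 8 reads match only taxon 0, 2 reads match
# taxon 0 and taxon 1 equally well, and no read matches taxon 2.
reads = [[1.0, 0.0, 0.0]] * 8 + [[0.5, 0.5, 0.0]] * 2
estimate = em_abundances(reads)
```

In this toy input, taxon 1 is supported only by reads that taxon 0 explains equally well, so its estimated share shrinks toward zero across iterations while taxon 0 absorbs the ambiguous reads. A profiler could then drop taxa below some abundance cutoff; any such cutoff is likewise an assumption here.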
