Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps.

Affiliations

Institute of Medical Microbiology and Hospital Hygiene, Heinrich-Heine-University Düsseldorf, Düsseldorf, North Rhine-Westphalia, Germany.

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, 20892, USA.

Publication information

Nat Commun. 2019 Jul 11;10(1):3066. doi: 10.1038/s41467-019-10934-2.

DOI:10.1038/s41467-019-10934-2

PMID:31296857

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6624308/

Abstract

Metagenomic sequence classification should be fast, accurate and information-rich. Emerging long-read sequencing technologies promise to improve the balance between these factors, but most existing methods were designed for short reads. MetaMaps is a new method, specifically developed for long reads, capable of mapping a long-read metagenome to a comprehensive RefSeq database with >12,000 genomes in <16 GB of RAM on a laptop computer. Integrating approximate mapping with probabilistic scoring and EM-based estimation of sample composition, MetaMaps achieves >94% accuracy for species-level read assignment and r > 0.97 for the estimation of sample composition on both simulated and real data when the sample genomes or close relatives are present in the classification database. To address novel species and genera, which are comparatively harder to predict, MetaMaps outputs mapping locations and qualities for all classified reads, enabling functional studies (e.g. gene presence/absence) and detection of incongruities between sample and reference genomes.
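
The EM-based estimation of sample composition mentioned in the abstract can be illustrated with a generic mixture-model sketch. This is not the MetaMaps implementation: the function name `em_composition`, the input layout (one row per read, one hypothetical likelihood per candidate genome) and the stopping parameters are all assumptions made for illustration only.

```python
def em_composition(likelihoods, n_iter=200, tol=1e-9):
    """Estimate genome proportions from per-read likelihoods with EM.

    likelihoods[i][j] is an assumed score proportional to
    P(read i | genome j); the values are hypothetical inputs,
    not the output format of any real tool.
    """
    n_genomes = len(likelihoods[0])
    props = [1.0 / n_genomes] * n_genomes  # uniform starting composition

    for _ in range(n_iter):
        # E-step: split each read across genomes in proportion to
        # (current genome proportion) x (read likelihood).
        counts = [0.0] * n_genomes
        for row in likelihoods:
            weighted = [p * l for p, l in zip(props, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read is not explained by any genome; skip it
            for j, w in enumerate(weighted):
                counts[j] += w / total

        assigned = sum(counts)
        if assigned == 0.0:
            break  # nothing could be assigned; keep the current estimate

        # M-step: new proportions are the normalized expected read counts.
        new_props = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(new_props, props))
        props = new_props
        if delta < tol:
            break

    return props


# Toy input: three reads match only genome 0, one read matches only
# genome 1, and one read is equally compatible with both.
toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
estimate = em_composition(toy)
```

In this toy input the ambiguous read is shared between the two genomes according to the current composition estimate, so the proportions settle near 0.75 and 0.25 rather than treating the ambiguous read as a fixed half count for each genome.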

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56af/6624308/ae4842d34eea/41467_2019_10934_Fig1_HTML.jpg
