Metalign: efficient alignment-based metagenomic profiling via containment min hash.

Affiliations

Department of Computer Science, University of California, Los Angeles, CA, 90095, USA.

Department of Computer Science, ETH Zurich, Rämistrasse 101, CH-8092, Zurich, Switzerland.

Publication information

Genome Biol. 2020 Sep 10;21(1):242. doi: 10.1186/s13059-020-02159-0.

DOI:10.1186/s13059-020-02159-0

PMID:32912225

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7488264/

Abstract

Metagenomic profiling, predicting the presence and relative abundances of microbes in a sample, is a critical first step in microbiome analysis. Alignment-based approaches are often considered accurate yet computationally infeasible. Here, we present a novel method, Metalign, that performs efficient and accurate alignment-based metagenomic profiling. We use a novel containment min hash approach to pre-filter the reference database prior to alignment and then process both uniquely aligned and multi-aligned reads to produce accurate abundance estimates. In performance evaluations on both real and simulated datasets, Metalign is the only method evaluated that maintained high performance and competitive running time across all datasets.
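
The pre-filtering idea in the abstract can be illustrated with a minimal sketch. It is not Metalign's actual implementation: the function names, the k-mer length, the sketch size, and the cutoff are all assumptions chosen for a toy example, and an exact Python set stands in for whatever compact membership structure a real tool would use. The sketch estimates, for each reference genome, the fraction of its k-mers that also occur in the sample (the containment index) from a small min-hash sample of the reference's k-mers, and keeps only references above a cutoff as alignment candidates.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stable_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (the built-in hash() is salted)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(kmer_set, size):
    """Keep the `size` k-mers with the smallest hash values."""
    return sorted(kmer_set, key=stable_hash)[:size]


def containment_estimate(reference, sample_kmers, k, sketch_size):
    """Estimate |ref k-mers in sample| / |ref k-mers| from a reference sketch."""
    sketch = bottom_sketch(kmers(reference, k), sketch_size)
    if not sketch:
        return 0.0
    hits = sum(1 for km in sketch if km in sample_kmers)
    return hits / len(sketch)


def prefilter(references, reads, k=5, sketch_size=8, cutoff=0.5):
    """Return names of references whose estimated containment meets the cutoff."""
    sample_kmers = set()
    for read in reads:
        sample_kmers |= kmers(read, k)
    return [
        name
        for name, seq in references.items()
        if containment_estimate(seq, sample_kmers, k, sketch_size) >= cutoff
    ]


# Toy data (hypothetical): the reads were cut from genome_a only.
references = {
    "genome_a": "ACGTTGCATGCCGATAGC",
    "genome_b": "TTTTTTTTTTGGGGGGGGGG",
}
reads = ["ACGTTGCATG", "GCATGCCGATAGC"]
kept = prefilter(references, reads)
```

In this toy case only `genome_a` survives the filter, so a downstream aligner would need to index one reference instead of two. The point of sketching only the reference side is that the estimate stays meaningful even when the sample is far larger than any single genome, which is the situation a plain Jaccard-style min hash handles poorly.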

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e7d1/7488264/941705971f2b/13059_2020_2159_Fig1_HTML.jpg
