Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.

Author information

[1] Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby, Denmark. [2] Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark. [3].

[1] INRA, Institut National de la Recherche Agronomique, UMR 14121 MICALIS, Jouy en Josas, France. [2] INRA, Institut National de la Recherche Agronomique, US 1367 Metagenopolis, Jouy en Josas, France. [3] Department of Computer Science, Center for Bioinformatics and Computational Biology, University of Maryland, USA. [4].

Publication information

Nat Biotechnol. 2014 Aug;32(8):822-8. doi: 10.1038/nbt.2939. Epub 2014 Jul 6.

DOI: 10.1038/nbt.2939

PMID: 24997787

Abstract

Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
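The core idea in the abstract, grouping genes whose abundances rise and fall together across samples, can be illustrated with a minimal sketch. It assumes a simple greedy grouping by Pearson correlation against a seed gene and is not the authors' implementation. The function names, the 0.9 correlation threshold, and the toy abundance profiles are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / math.sqrt(var_x * var_y)


def bin_co_abundant_genes(profiles, min_corr=0.9):
    """Greedily group genes whose profiles correlate with a seed gene.

    profiles: dict mapping gene id -> list of abundances, one per sample.
    min_corr: hypothetical threshold (an assumption, not from the source).
    """
    unassigned = list(profiles)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)  # next ungrouped gene becomes the seed
        group, remaining = [seed], []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= min_corr:
                group.append(gene)
            else:
                remaining.append(gene)
        unassigned = remaining
        groups.append(group)
    return groups


# Toy abundance profiles across four samples (invented values).
profiles = {
    "geneA1": [10.0, 0.0, 5.0, 20.0],
    "geneA2": [11.0, 0.5, 4.5, 19.0],
    "geneB1": [0.0, 8.0, 8.0, 1.0],
    "geneB2": [0.2, 7.5, 8.5, 0.8],
    "geneC1": [3.0, 3.0, 9.0, 0.0],
}
groups = bin_co_abundant_genes(profiles)
print(groups)
```

In this toy data, genes that belong to the same hypothetical organism share an abundance pattern across samples and end up in the same group, while a gene with an unrelated pattern stays alone. A real analysis would operate on millions of genes and would need a more scalable clustering strategy than this quadratic loop.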

Similar articles

MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data.

Bioinformatics. 2019 May 1;35(9):1544-1552. doi: 10.1093/bioinformatics/bty830.

Clustering co-abundant genes identifies components of the gut microbiome that are reproducibly associated with colorectal cancer and inflammatory bowel disease.

Microbiome. 2019 Aug 1;7(1):110. doi: 10.1186/s40168-019-0722-6.

ReprDB and panDB: minimalist databases with maximal microbial representation.

Microbiome. 2018 Jan 18;6(1):15. doi: 10.1186/s40168-018-0399-2.

Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

Brief Bioinform. 2019 Jul 19;20(4):1140-1150. doi: 10.1093/bib/bbx098.

A scalable assembly-free variable selection algorithm for biomarker discovery from metagenomes.

BMC Bioinformatics. 2016 Aug 19;17(1):311. doi: 10.1186/s12859-016-1186-3.

Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes.

Sci Rep. 2016 Apr 12;6:24175. doi: 10.1038/srep24175.

Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets.

BMC Bioinformatics. 2020 Jul 28;21(1):334. doi: 10.1186/s12859-020-03667-3.

Adversarial and variational autoencoders improve metagenomic binning.

Commun Biol. 2023 Oct 21;6(1):1073. doi: 10.1038/s42003-023-05452-3.

Metagenomic Assembly: Reconstructing Genomes from Metagenomes.

Methods Mol Biol. 2021;2242:139-152. doi: 10.1007/978-1-0716-1099-2_9.

Cited by

Discovery of a widespread chemical signalling pathway in the Bacteroidota.

Nature. 2025 Aug 20. doi: 10.1038/s41586-025-09418-9.

Histamine Metabolism in IBD: Towards Precision Nutrition.

Nutrients. 2025 Jul 29;17(15):2473. doi: 10.3390/nu17152473.

Unlocking the potential of CRISPR tools and databases for precision genome editing.

Front Plant Sci. 2025 Jul 21;16:1563711. doi: 10.3389/fpls.2025.1563711. eCollection 2025.

Metagenomics-Metabolomics Reveals the Alleviation of Indole-3-Ethanol on Radiation-Induced Enteritis in Mice.

J Microbiol Biotechnol. 2025 Jul 18;35:e2502037. doi: 10.4014/jmb.2502.02037.

Jian-Pi-Yi-Shen formula improves kidney function by regulating gut microbiome in rats with chronic kidney disease.

Front Cell Infect Microbiol. 2025 Jul 9;15:1526863. doi: 10.3389/fcimb.2025.1526863. eCollection 2025.

Large-scale classification of metagenomic samples: a comparative analysis of classical machine learning techniques vs a novel brain-inspired hyperdimensional computing approach.

bioRxiv. 2025 Jul 7:2025.07.06.663394. doi: 10.1101/2025.07.06.663394.

Impacts of Captive Domestication and Geographical Divergence on the Gut Microbiome of Endangered Forest Musk Deer.

Animals (Basel). 2025 Jul 2;15(13):1954. doi: 10.3390/ani15131954.

Benchmarking and optimizing qualitative and quantitative pipelines in environmental metatranscriptomics using mixture controlling experiments.

ISME Commun. 2025 May 29;5(1):ycaf090. doi: 10.1093/ismeco/ycaf090. eCollection 2025 Jan.

Genome-resolved metagenomics from short-read sequencing data in the era of artificial intelligence.

Funct Integr Genomics. 2025 Jun 10;25(1):124. doi: 10.1007/s10142-025-01625-x.

Integrative cross-tissue analysis unveils complement-immunoglobulin augmentation and dysbiosis-related fatty acid metabolic remodeling during mammalian aging.

Imeta. 2025 Apr 12;4(3):e70027. doi: 10.1002/imt2.70027. eCollection 2025 Jun.
