Microbiome Program, Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA.
Division of Surgery Research, Department of Surgery, Mayo Clinic, Rochester, Minnesota, USA.
mSystems. 2022 Dec 20;7(6):e0092522. doi: 10.1128/msystems.00925-22. Epub 2022 Nov 15.
Biosynthetic gene clusters (BGCs) in microbial genomes encode bioactive secondary metabolites (SMs), which can play important roles in microbe-microbe and host-microbe interactions. Given the biological significance of SMs and the current profound interest in the metabolic functions of microbiomes, the unbiased identification of BGCs from high-throughput metagenomic data could offer novel insights into the complex chemical ecology of microbial communities. Currently available tools for predicting BGCs from shotgun metagenomes have several limitations: they require computationally demanding read assembly, predict only a narrow breadth of BGC classes, and do not report the SM product. To overcome these limitations, we developed taxonomy-guided identification of biosynthetic gene clusters (TaxiBGC), a command-line tool for predicting experimentally characterized BGCs (and inferring their known SMs) in metagenomes by first pinpointing the microbial species likely to harbor them. We benchmarked TaxiBGC on various simulated metagenomes, showing that our taxonomy-guided approach could predict BGCs with much-improved performance (mean F1 score, 0.56; mean PPV score, 0.80) compared with directly identifying BGCs by mapping sequencing reads onto the BGC genes (mean F1 score, 0.49; mean PPV score, 0.41). Next, by applying TaxiBGC to 2,650 metagenomes from the Human Microbiome Project and various case-control gut microbiome studies, we were able to associate BGCs (and their SMs) with different human body sites and with multiple diseases, including Crohn's disease and liver cirrhosis. In all, TaxiBGC provides an in silico platform to predict experimentally characterized BGCs and their SM production potential in metagenomic data while demonstrating important advantages over existing techniques.
Currently available bioinformatics tools to identify BGCs from metagenomic sequencing data are limited in their predictive capability or in their ease of use, even for computationally oriented researchers. We present an automated computational pipeline called TaxiBGC, which predicts experimentally characterized BGCs (and infers their known SMs) in shotgun metagenomes by first considering the microbial species source. Through rigorous benchmarking on simulated metagenomes, we show that TaxiBGC provides a significant advantage over existing methods. When demonstrating TaxiBGC on thousands of human microbiome samples, we associate BGCs encoding bacteriocins with different human body sites and diseases, thereby elucidating a possible novel role of this antibiotic class in maintaining the stability of microbial ecosystems throughout the human body. Furthermore, we report for the first time gut microbial BGC associations shared among multiple pathologies. Ultimately, we expect our tool to facilitate future investigations into the chemical ecology of microbial communities across diverse niches and pathologies.
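The benchmark figures quoted above are a PPV (positive predictive value, i.e., precision) and an F score. As a rough illustration of how such scores can be computed for one simulated metagenome, the sketch below compares a set of predicted BGC identifiers with the set known to be present. It assumes the F score is the F1 score (harmonic mean of PPV and recall); the function name and the BGC labels are hypothetical and are not taken from TaxiBGC itself.

```python
def ppv_and_f1(predicted, actual):
    """Return (PPV, F1) for predicted vs. truly present BGC identifiers.

    Hypothetical helper for illustration only; not part of TaxiBGC.
    """
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)   # BGCs predicted and truly present
    fp = len(predicted - actual)   # BGCs predicted but absent
    fn = len(actual - predicted)   # BGCs present but missed

    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    return ppv, f1


# Made-up identifiers standing in for one simulated metagenome.
predicted_bgcs = {"BGC_A", "BGC_B", "BGC_C", "BGC_D"}
actual_bgcs = {"BGC_A", "BGC_B", "BGC_C", "BGC_E", "BGC_F"}

ppv, f1 = ppv_and_f1(predicted_bgcs, actual_bgcs)
print(f"PPV = {ppv:.2f}, F1 = {f1:.2f}")
```

Averaging these per-sample values over all simulated metagenomes would give mean scores of the kind reported in the abstract; the exact averaging procedure used by the authors is not stated here.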