Mohite Omkar S, Lloyd Colton J, Monk Jonathan M, Weber Tilmann, Palsson Bernhard O
The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, 2800, Denmark.
Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA.
Synth Syst Biotechnol. 2022 May 6;7(3):900-910. doi: 10.1016/j.synbio.2022.04.011. eCollection 2022 Sep.
In silico genome mining provides easy access to secondary metabolite biosynthetic gene clusters (BGCs) encoding the biosynthesis of many bioactive compounds, which are the basis for many important drugs used in human medicine. However, the association between BGCs and other functions encoded in the genomes of producers has remained elusive. Here, we present a systems biology workflow that integrates genome mining with a detailed pangenome analysis for detecting genes associated with a particular BGC. We analyzed 3,889 enterobacterial genomes and found 13,266 BGCs, represented by 252 distinct BGC families and 347 additional singletons. A pangenome analysis revealed 88 genes, coding for diverse metabolic and regulatory functions, that are putatively associated with a specific BGC coding for the colon cancer-related colibactin. The presented workflow opens up the possibility to discover novel secondary metabolites and better understand their physiological roles, and provides a guide to identify and analyze BGC-associated gene sets.
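The core idea of detecting genes associated with a particular BGC can be illustrated with a minimal sketch. It assumes the pangenome has already been reduced to a gene presence/absence table and that the set of genomes carrying the BGC is known. The function name, the toy data, and the frequency thresholds are hypothetical and chosen only for illustration; the abstract does not specify the statistical criterion the authors used, so this is not their method.

```python
from typing import Dict, List, Set, Tuple


def bgc_associated_genes(
    gene_presence: Dict[str, Set[str]],
    bgc_genomes: Set[str],
    all_genomes: Set[str],
    min_in_bgc: float = 0.9,    # assumed threshold, not from the paper
    max_in_other: float = 0.1,  # assumed threshold, not from the paper
) -> List[Tuple[str, float, float]]:
    """Return genes common in BGC-positive genomes and rare elsewhere.

    gene_presence maps each pangenome gene to the genomes that carry it.
    Each hit is (gene, fraction in BGC-positive, fraction in BGC-negative).
    """
    other = all_genomes - bgc_genomes
    if not bgc_genomes or not other:
        raise ValueError("need both BGC-positive and BGC-negative genomes")

    hits = []
    for gene, carriers in sorted(gene_presence.items()):
        frac_bgc = len(carriers & bgc_genomes) / len(bgc_genomes)
        frac_other = len(carriers & other) / len(other)
        if frac_bgc >= min_in_bgc and frac_other <= max_in_other:
            hits.append((gene, frac_bgc, frac_other))
    return hits


# Toy example: six genomes, three of which carry the BGC of interest.
genomes = {"g1", "g2", "g3", "g4", "g5", "g6"}
with_bgc = {"g1", "g2", "g3"}
presence = {
    "geneA": {"g1", "g2", "g3"},  # tracks the BGC exactly
    "geneB": set(genomes),        # core gene, present everywhere
    "geneC": {"g1", "g4"},        # accessory gene unrelated to the BGC
}
hits = bgc_associated_genes(presence, with_bgc, genomes)
print(hits)
```

In this toy data only geneA passes both thresholds. On real data, a frequency filter of this kind would typically be followed by a statistical test and a correction for population structure, since closely related genomes share many genes regardless of BGC content.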