Leão Tiago F, Wang Mingxun, da Silva Ricardo, Gurevich Alexey, Bauermeister Anelize, Gomes Paulo Wender P, Brejnrod Asker, Glukhov Evgenia, Aron Allegra T, Louwen Joris J R, Kim Hyun Woo, Reher Raphael, Fiore Marli F, van der Hooft Justin J J, Gerwick Lena, Gerwick William H, Bandeira Nuno, Dorrestein Pieter C
Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA.
Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba 13400-970, SP, Brazil.
PNAS Nexus. 2022 Nov 16;1(5):pgac257. doi: 10.1093/pnasnexus/pgac257. eCollection 2022 Nov.
Microbial specialized metabolites are an important source of, and inspiration for, many pharmaceuticals and biotechnological products, and they play key roles in ecological processes. Untargeted metabolomics using liquid chromatography coupled with tandem mass spectrometry is an efficient technique to access metabolites from fractions and even from environmental crude extracts. Nevertheless, metabolomics is limited in its ability to predict the structures or bioactivities of cryptic metabolites. Efficiently linking the biosynthetic potential inferred from (meta)genomics to the specialized metabolome would accelerate drug discovery programs by allowing metabolomics to make use of genetic predictions. Here, we present a k-nearest neighbor classifier that systematically connects mass spectrometry fragmentation spectra to their corresponding biosynthetic gene clusters, independent of their chemical class. Our new pattern-based genome mining pipeline links biosynthetic genes to the metabolites they encode, as detected via mass spectrometry from bacterial cultures or environmental microbiomes. Using paired datasets that include validated gene-mass spectral links from the Paired Omics Data Platform, we demonstrate this approach by automatically linking 18 previously known mass spectra (17 for which the biosynthetic gene clusters can be found in the MIBiG database, plus palmyramide A) to their corresponding biosynthetic genes, which had previously been validated experimentally (e.g., via nuclear magnetic resonance or genetic engineering). We provide a reproducible computational example of how to use our Natural Products Mixed Omics (NPOmix) tool for siderophore mining. We conclude that NPOmix minimizes the need for culturing (it worked well on microbiomes) and facilitates specialized metabolite prioritization based on integrative omics mining.
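The idea of using a k-nearest neighbor classifier to connect a fragmentation spectrum to candidate biosynthetic gene clusters can be illustrated with a minimal sketch. It is not the NPOmix implementation: it assumes that each gene cluster family (GCF) and each spectrum is reduced to a presence/absence fingerprint over the same set of paired samples, and it uses Jaccard distance as a stand-in for whatever similarity measure the tool actually applies. The function names, the GCF labels, and the toy fingerprints are all hypothetical.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Jaccard distance between two equal-length 0/1 fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 1.0


def knn_candidates(query, training, k=3):
    """Rank GCF labels by how often they occur among the k nearest fingerprints.

    `training` is a list of (gcf_label, fingerprint) pairs (hypothetical layout).
    """
    ranked = sorted(training, key=lambda item: jaccard_distance(query, item[1]))
    nearest = [label for label, _ in ranked[:k]]
    # most_common() orders by vote count; ties keep nearest-first order
    return [label for label, _ in Counter(nearest).most_common()]


# Toy data (assumption): columns are six paired samples, 1 = present.
training = [
    ("GCF_A", [1, 1, 0, 0, 1, 0]),
    ("GCF_A", [1, 1, 0, 0, 1, 1]),
    ("GCF_B", [0, 0, 1, 1, 0, 0]),
    ("GCF_B", [0, 1, 1, 1, 0, 0]),
    ("GCF_C", [0, 0, 0, 1, 0, 1]),
]

# Fingerprint of one MS/MS spectrum: the samples in which it was detected.
query = [1, 1, 0, 0, 1, 0]

candidates = knn_candidates(query, training, k=3)
print(candidates)
```

In this toy case the spectrum co-occurs with the genomes carrying GCF_A, so that family ranks first. A real analysis would need many more samples and a validated set of gene-spectrum links to judge how reliable such rankings are.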