Beck Max L, Song Siyeon, Shuster Isra E, Miharia Aarzu, Walker Allison S
Department of Chemistry, Vanderbilt University. 1234 Stevenson Center Lane, Nashville, TN 37240, United States.
Department of Biological Sciences, Vanderbilt University. VU Station B, Box 35-1634, Nashville, TN 37235, United States.
J Ind Microbiol Biotechnol. 2023 Feb 17;50(1). doi: 10.1093/jimb/kuad024.
Bacteria have long been a source of natural products with diverse bioactivities that have been developed into therapeutics to treat human disease. Historically, researchers have focused on a few taxa of bacteria, mainly Streptomyces and other actinomycetes. This strategy was initially highly successful and resulted in the golden era of antibiotic discovery. The golden era ended when the most common antibiotics from Streptomyces had been discovered. Rediscovery of known compounds has plagued natural product discovery ever since. Recently, there has been increasing interest in identifying other taxa that produce bioactive natural products. Several bioinformatics studies have identified promising taxa with high biosynthetic capacity. However, these studies do not address the question of whether any of the products produced by these taxa are likely to have activities that will make them useful as human therapeutics. We address this gap by applying a recently developed machine learning tool that predicts natural product activity from biosynthetic gene cluster (BGC) sequences to determine which taxa are likely to produce compounds that are not only novel but also bioactive. This machine learning tool is trained on a dataset of BGC-natural product activity pairs and relies on counts of different protein domains and resistance genes in the BGC to make its predictions. We find that rare and understudied actinomycetes are the most promising sources for novel active compounds. There are also several taxa outside of actinomycetes that are likely to produce novel active compounds. We also find that most strains of Streptomyces likely produce both characterized and uncharacterized bioactive natural products. The results of this study provide guidelines to increase the efficiency of future bioprospecting efforts.
ONE-SENTENCE SUMMARY: This paper combines several bioinformatics workflows to identify which genera of bacteria are most likely to produce novel natural products with useful bioactivities such as antibacterial, antitumor, or antifungal activity.
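The abstract states that the machine learning tool relies on counts of protein domains and resistance genes in each BGC. As a rough illustration of that kind of input, and not the authors' implementation, the sketch below turns a list of annotations per cluster into a fixed-length count vector. The cluster names, the annotation labels, and the helper functions `build_vocabulary` and `count_features` are all hypothetical; the real tool's feature set and model are not specified here.

```python
from collections import Counter


def build_vocabulary(clusters):
    """Return a sorted list of every annotation label seen in any cluster."""
    return sorted({label for annotations in clusters for label in annotations})


def count_features(annotations, vocabulary):
    """Map one cluster's annotation labels to a count vector over the vocabulary."""
    counts = Counter(annotations)
    return [counts[label] for label in vocabulary]


# Hypothetical toy data: each BGC is a list of domain / resistance-gene labels.
toy_clusters = {
    "bgc_a": ["PKS_KS", "PKS_AT", "PKS_KS", "resistance_efflux"],
    "bgc_b": ["NRPS_A", "NRPS_C", "NRPS_A"],
}

vocabulary = build_vocabulary(toy_clusters.values())
features = {
    name: count_features(annotations, vocabulary)
    for name, annotations in toy_clusters.items()
}

for name, vector in features.items():
    print(name, vector)
```

Count vectors of this shape could then be paired with known activity labels to train a classifier, which is the general setup the abstract describes.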