Kloosterman Alexander M, Shelton Kyle E, van Wezel Gilles P, Medema Marnix H, Mitchell Douglas A
Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands.
Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
mSystems. 2020 Sep 1;5(5):e00267-20. doi: 10.1128/mSystems.00267-20.
Many ribosomally synthesized and posttranslationally modified peptide classes (RiPPs) are reliant on a domain called the RiPP recognition element (RRE). The RRE binds specifically to a precursor peptide and directs the posttranslational modification enzymes to their substrates. Given its prevalence across various types of RiPP biosynthetic gene clusters (BGCs), the RRE could theoretically be used as a bioinformatic handle to identify novel classes of RiPPs. In addition, due to the high affinity and specificity of most RRE-precursor peptide complexes, a thorough understanding of the RRE domain could be exploited for biotechnological applications. However, sequence divergence of RREs across RiPP classes has precluded automated identification based solely on sequence similarity. Here, we introduce RRE-Finder, a new tool for identifying RRE domains with high sensitivity. RRE-Finder can be used in precision mode to confidently identify RREs in a class-specific manner or in exploratory mode to assist in the discovery of novel RiPP classes. RRE-Finder operating in precision mode on the UniProtKB protein database retrieved ∼25,000 high-confidence RREs spanning all characterized RRE-dependent RiPP classes, as well as several yet-uncharacterized RiPP classes that require future experimental confirmation. Finally, RRE-Finder was used in precision mode to explore a possible evolutionary origin of the RRE domain. The results suggest RREs originated from a co-opted DNA-binding transcriptional regulator domain. Altogether, RRE-Finder provides a powerful new method to probe RiPP biosynthetic diversity and delivers a rich data set of RRE sequences that will provide a foundation for deeper biochemical studies into this intriguing and versatile protein domain.
Bioinformatics-powered discovery of novel ribosomal natural products (RiPPs) has historically been hindered by the lack of a common genetic feature across RiPP classes. Herein, we introduce RRE-Finder, a method for identifying RRE domains, which are present in a majority of prokaryotic RiPP biosynthetic gene clusters (BGCs). RRE-Finder identifies RRE domains 3,000 times faster than current methods, which rely on time-consuming secondary structure prediction. Depending on user goals, RRE-Finder can operate in precision mode to accurately identify RREs present in known RiPP classes or in exploratory mode to assist with novel RiPP discovery. Employing RRE-Finder on the UniProtKB database revealed several high-confidence RREs in novel RiPP-like clusters, suggesting that many new RiPP classes remain to be discovered.