Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.
State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, PR China.
PLoS Comput Biol. 2023 May 15;19(5):e1011100. doi: 10.1371/journal.pcbi.1011100. eCollection 2023 May.
Non-ribosomal peptide synthetases (NRPSs) are a diverse family of biosynthetic enzymes that assemble bioactive peptides. Despite advances in microbial sequencing, the lack of a consistent standard for annotating NRPS domains and modules has made data-driven discoveries challenging. To address this, we introduced a standardized architecture for NRPSs, using known conserved motifs to partition typical domains. This motif-and-intermotif standardization allowed for systematic evaluation of sequence properties across a large number of NRPS pathways, resulting in the most comprehensive cross-kingdom C domain subtype classification to date, as well as the discovery and experimental validation of novel conserved motifs with functional significance. Furthermore, our coevolution analysis revealed important barriers associated with re-engineering NRPSs and uncovered the entanglement between phylogeny and substrate specificity in NRPS sequences. Our findings provide a comprehensive and statistically insightful analysis of NRPS sequences, opening avenues for future data-driven discoveries.
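The idea of partitioning a domain into motif and intermotif regions can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `partition_by_motifs`, the motif labels `M1` and `M2`, their regular-expression patterns, and the toy sequence are all made up for illustration and do not correspond to real NRPS motifs. The sketch assumes only that motifs occur in a known order along the sequence.

```python
import re


def partition_by_motifs(sequence, motifs):
    """Split `sequence` into alternating intermotif and motif segments.

    `motifs` is an ordered list of (name, regex) pairs. Each motif is
    searched for only downstream of the previous match, so the list
    order encodes the expected motif order along the domain.
    """
    segments = []
    cursor = 0
    prev = "start"
    for name, pattern in motifs:
        match = re.compile(pattern).search(sequence, cursor)
        if match is None:
            raise ValueError(f"motif {name!r} not found after position {cursor}")
        # Region between the previous motif (or the start) and this motif.
        segments.append((f"{prev}-{name}", sequence[cursor:match.start()]))
        # The motif itself.
        segments.append((name, match.group()))
        cursor = match.end()
        prev = name
    # Trailing region after the last motif.
    segments.append((f"{prev}-end", sequence[cursor:]))
    return segments


# Toy input: hypothetical patterns, not real NRPS motifs.
toy_motifs = [("M1", "AC.DE"), ("M2", "GH.KL")]
toy_sequence = "MMMACWDEPPPPGHYKLTT"

for label, segment in partition_by_motifs(toy_sequence, toy_motifs):
    print(label, segment)
```

Because every residue ends up in exactly one labeled segment, sequences partitioned this way can be compared segment by segment (for example, length or conservation per intermotif region), which is the kind of systematic evaluation the standardization is meant to enable.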