Classification of the adenylation and acyl-transferase activity of NRPS and PKS systems using ensembles of substrate specific hidden Markov models.

Affiliation

Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands.

Publication information

PLoS One. 2013 Apr 18;8(4):e62136. doi: 10.1371/journal.pone.0062136. Print 2013.

Abstract

There is growing interest in the non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) of microbes, fungi and plants because they can produce bioactive peptides such as antibiotics. The ability to identify the substrate specificity of the enzyme's adenylation (A) and acyl-transferase (AT) domains is essential to rationally deduce or engineer new products. Here we report a hidden Markov model (HMM)-based ensemble method that predicts the substrate specificity with high quality. We collected a new reference set of experimentally validated sequences. An initial classification based on alignment and neighbor joining was performed, in line with most of the previously published prediction methods. We then created and tested single substrate-specific HMMs and found that their use significantly improved correct identification for A as well as for AT domains. A major advantage of HMMs is that they abolish the dependency on multiple sequence alignment and residue selection that hampers the alignment-based clustering methods. Using our models, we obtained a high prediction quality for the substrate specificity of the A domains, similar to that of two recently published tools that make use of HMMs or support vector machines (NRPSsp and NRPS predictor2, respectively). Moreover, replacing the single substrate-specific HMMs with ensembles of models caused a clear increase in prediction quality. We argue that the superiority of the ensemble over the single model is caused by the way substrate specificity evolves in the studied systems. It is likely that this also holds true for other protein domains. The ensemble predictor has been implemented in a simple web-based tool that is available at http://www.cmbi.ru.nl/NRPS-PKS-substrate-predictor/.
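
The idea of replacing one HMM per substrate with an ensemble of HMMs per substrate can be illustrated with a small sketch. The abstract does not say how the scores of the models within an ensemble are combined, so the aggregation rule here (best-scoring model per substrate by default) is an assumption, as are the function name `predict_substrate`, the model names, and the scores, which are made-up values standing in for the output of an HMM search.

```python
from collections import defaultdict
from statistics import mean


def predict_substrate(model_scores, model_substrate, aggregate=max):
    """Pick the substrate whose ensemble of models scores the query best.

    model_scores:    dict of model name -> score for one query domain
                     (hypothetical values, e.g. HMM bit scores).
    model_substrate: dict of model name -> substrate label.
    aggregate:       how scores within one ensemble are combined.
                     This rule is an assumption, not taken from the paper.
    """
    # Group the per-model scores by the substrate each model represents.
    per_substrate = defaultdict(list)
    for model, score in model_scores.items():
        per_substrate[model_substrate[model]].append(score)

    # Reduce each ensemble to a single score, then take the best substrate.
    combined = {sub: aggregate(vals) for sub, vals in per_substrate.items()}
    best = max(combined, key=combined.get)
    return best, combined


# Hypothetical ensemble: two models per substrate.
model_substrate = {"ala_1": "Ala", "ala_2": "Ala", "ser_1": "Ser", "ser_2": "Ser"}
# Hypothetical scores for a single query A domain.
scores = {"ala_1": 35.0, "ala_2": 120.5, "ser_1": 80.0, "ser_2": 90.0}

best_by_max, _ = predict_substrate(scores, model_substrate)
best_by_mean, _ = predict_substrate(scores, model_substrate, aggregate=mean)
print(best_by_max, best_by_mean)
```

With these made-up numbers the two aggregation rules disagree: taking the best model per ensemble favors Ala, while averaging favors Ser. A single model per substrate cannot express this choice at all, which is one way to read the abstract's point that substrate specificity may be spread over several distinct sequence groups.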
