Pompeu Fabra University, Barcelona, Spain.
Lead Molecular Design, S.L., Sant Cugat del Vallés, Spain.
PLoS One. 2019 Jan 8;14(1):e0199270. doi: 10.1371/journal.pone.0199270. eCollection 2019.
Peptide drugs have been used to treat multiple pathologies. During peptide discovery, it is crucial to map the potential protease cleavage sites. This knowledge is then used to chemically modify the peptide drug for therapeutic use, making it stable against individual proteases or in complex media. In other cases, the peptide needs to be made specifically unstable to certain proteases, since peptides can be used as a system to target drug delivery to specific tissues or cells. Information about proteases, their cleavage sites and their substrates is widely spread across publications and collected in databases such as MEROPS. It is therefore possible to develop models that improve the understanding of potential peptide drug proteolysis. We propose a new workflow to derive protease specificity rules and predict the potential scissile bonds in peptides for individual proteases. WebMetabase stores information from experimental or external sources in a chemically aware database in which each peptide and cleavage site is represented as a sequence of structural blocks connected by amide bonds and characterized by physicochemical properties described by Volsurf descriptors. This methodology can therefore also be applied to non-standard amino acids. A frequency analysis can be performed in WebMetabase to discover the most frequent cleavage sites. These results were used to train several models, using logistic regression, support vector machine and ensemble tree classifiers, to map cleavage sites for several human proteases from four different families (serine, cysteine, aspartic and matrix metalloproteases). Finally, we compared the predictive performance of the developed models with that of the publicly available tools PROSPERous and SitePrediction.
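The frequency analysis mentioned above can be illustrated with a minimal sketch. It is not the WebMetabase implementation: it assumes peptides written with one-letter codes for standard amino acids, whereas the workflow described here represents each residue as a structural block with physicochemical descriptors. The function name `scissile_pair_frequencies` and the example records are hypothetical and chosen only for illustration.

```python
from collections import Counter


def scissile_pair_frequencies(records):
    """Relative frequency of the residue pair flanking each cleaved bond.

    `records` is an iterable of (sequence, cleavage_index) tuples, where the
    cleaved amide bond lies between sequence[cleavage_index - 1] (the P1
    residue) and sequence[cleavage_index] (the P1' residue).
    This record layout is an assumption made for this sketch.
    """
    counts = Counter()
    for sequence, index in records:
        # Skip indices that do not point at a bond inside the peptide.
        if 0 < index < len(sequence):
            counts[(sequence[index - 1], sequence[index])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Made-up toy records, not experimental data.
toy_records = [("AKGL", 2), ("GKGA", 2), ("ARSL", 2)]
frequencies = scissile_pair_frequencies(toy_records)
most_frequent_pair = max(frequencies, key=frequencies.get)
```

Pairs that dominate such a table are candidate specificity rules for a given protease. A per-bond feature vector built from the flanking residues could then be passed to a classifier such as logistic regression, which is the kind of model the abstract names.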