Univ. Lille, CNRS, Centrale Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, 59000, Lille, France.
Computer Science Department, Hodeidah University, Hodeidah, Yemen.
J Comput Aided Mol Des. 2020 Nov;34(11):1147-1156. doi: 10.1007/s10822-020-00336-8. Epub 2020 Aug 19.
A fingerprint based on the monomer composition of nonribosomal peptides (NRPs), the MCFP, was introduced previously. MCFP describes an NRP structure in fingerprint form using only its monomer composition. Applied to the Norine NRP database, it enabled effective screening and prediction of biological activities. In this paper, we present an extension of the MCFP fingerprint. The extension adds a few columns to the fingerprint, representing monomer clusters, 2D structures, peptide categories, and peptide diversity. All of these data are extracted from the NRP structure. Experiments on the Norine NRP database showed that the extended MCFP, which we call the Monomer Structure FingerPrint (MSFP), achieved high prediction accuracy (> 95%) together with a high recall rate (86%) when used for prediction and similarity searching. This study indicates that MSFP, which is built mainly from monomer composition, can be improved substantially by adding more columns that carry useful information about the monomer composition and 2D structure of NRPs.
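To make the idea of "adding columns" concrete, the following is a minimal sketch of a composition fingerprint extended with cluster, 2D-structure, category, and diversity columns, plus a count-based Tanimoto similarity for searching. Everything in it is an illustrative assumption rather than the published method: the monomer vocabulary, the cluster mapping, the structure and category labels, the definition of diversity as the number of distinct monomers, and the function names `extended_fingerprint` and `tanimoto` are all hypothetical, and the abstract does not specify how the actual MSFP columns are encoded.

```python
from collections import Counter

# Hypothetical vocabularies, chosen only for illustration.
MONOMERS = ["Ala", "Val", "Leu", "Orn", "D-Phe"]
CLUSTERS = {
    "Ala": "small",
    "Val": "aliphatic",
    "Leu": "aliphatic",
    "Orn": "basic",
    "D-Phe": "aromatic",
}
CLUSTER_NAMES = ["small", "aliphatic", "basic", "aromatic"]
STRUCTURES = ["linear", "cyclic", "branched"]
CATEGORIES = ["peptide", "lipopeptide", "glycopeptide"]


def extended_fingerprint(monomers, structure, category):
    """Return composition counts followed by the extra columns."""
    counts = Counter(monomers)
    # Base columns: one count per monomer in the vocabulary.
    base = [counts[m] for m in MONOMERS]
    # Extra columns 1: counts per monomer cluster (unknown monomers skipped).
    cluster_counts = Counter(CLUSTERS[m] for m in monomers if m in CLUSTERS)
    cluster_cols = [cluster_counts[c] for c in CLUSTER_NAMES]
    # Extra columns 2 and 3: one-hot 2D structure type and peptide category.
    structure_cols = [int(structure == s) for s in STRUCTURES]
    category_cols = [int(category == c) for c in CATEGORIES]
    # Extra column 4: diversity, assumed here to be the distinct monomer count.
    diversity_col = [len(counts)]
    return base + cluster_cols + structure_cols + category_cols + diversity_col


def tanimoto(a, b):
    """Count-based Tanimoto: sum of minima over sum of maxima."""
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    return numerator / denominator if denominator else 1.0


# Toy peptides, not taken from Norine.
cyclic_example = extended_fingerprint(
    ["Val", "Orn", "Leu", "D-Phe", "Val", "Orn", "Leu", "D-Phe"],
    "cyclic",
    "peptide",
)
linear_example = extended_fingerprint(
    ["Ala", "Val", "Leu", "Leu"], "linear", "lipopeptide"
)
similarity = tanimoto(cyclic_example, linear_example)
```

In a similarity search, a query fingerprint would be compared against every database fingerprint with `tanimoto` and the highest-scoring entries returned; the extra columns let two peptides with similar composition but different structure or category score lower than they would on composition alone.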