Ihmaid Saleh K, Ahmed Hany E A, Zayed Mohamed F, Abadleh Mohammed M
Pharmacognosy and Pharmaceutical Chemistry Department, College of Pharmacy, Taibah University, P. O. Box 30039, Al-Madinah Al-Munawarah 41477, Saudi Arabia.
School of Pharmacy and Applied Science, La Trobe University, P. O. Box 199, Bendigo 3552, Australia.
Molecules. 2016 Jan 30;21(2):175. doi: 10.3390/molecules21020175.
A key step in a successful drug discovery pipeline is the identification of small, potent compounds that bind selectively and with high affinity to the target of interest. However, efficient and accurate computational methods for studying, and hence predicting, compound selectivity are still in short supply. In this work, we propose an affordable machine learning method for compound selectivity classification and prediction. For this purpose, we collected compounds with reported activity and built a selectivity database of 153 cathepsin K and S inhibitors of medicinal interest. The database comprises three compound sets: two selective sets (K/S and S/K) and one non-selective set (KS). We subjected this database to the selectivity classification tool 'Emergent Self-Organizing Maps' to explore its ability to distinguish cathepsin inhibitors that are selective for one target over the other. The method showed good clustering performance for selective ligands, with accuracy of up to 100%. Of the available descriptors, BAPs and MACCS molecular structural fingerprints were used for this classification. The results demonstrate that the method can be used to interpret structure-selectivity relationships, and selectivity markers were identified to guide the design of further novel inhibitors with high activity and target selectivity.
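The three-way split of the database (K/S selective, S/K selective, non-selective KS) can be illustrated with a minimal sketch. The abstract does not state the criterion used to assign compounds to each set, so the potency measure (IC50 in nM), the 10-fold cutoff, and the function name `selectivity_class` are assumptions made purely for illustration and are not the authors' procedure.

```python
from typing import Literal

# Labels follow the set names used in the abstract.
SelectivityClass = Literal["K/S", "S/K", "KS"]


def selectivity_class(
    ic50_k_nm: float,
    ic50_s_nm: float,
    fold_cutoff: float = 10.0,  # hypothetical threshold, not taken from the paper
) -> SelectivityClass:
    """Assign a compound to one of three sets from its two potency values.

    A lower IC50 means higher potency, so the ratio IC50(S) / IC50(K)
    is greater than 1 when the compound is more potent against cathepsin K.
    """
    if ic50_k_nm <= 0 or ic50_s_nm <= 0:
        raise ValueError("potency values must be positive")

    ratio = ic50_s_nm / ic50_k_nm
    if ratio >= fold_cutoff:
        return "K/S"  # selective for cathepsin K over S
    if ratio <= 1.0 / fold_cutoff:
        return "S/K"  # selective for cathepsin S over K
    return "KS"       # no marked preference for either target
```

With this assumed cutoff, a compound with an IC50 of 1 nM against cathepsin K and 100 nM against cathepsin S would fall in the K/S set, while one with 5 nM and 10 nM would be treated as non-selective.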