Computer-Aided Drug Design Laboratory, Department of Pharmacy, Birla Institute of Technology, Hyderabad Campus, Shameerpet, Hyderabad, 500078, India.
J Cheminform. 2013 Jan 14;5(1):2. doi: 10.1186/1758-2946-5-2.
Mycobacterium tuberculosis encodes 11 putative serine-threonine protein kinases (STPKs), which regulate transcription, cell development and interaction with host cells. Of these 11 STPKs, three kinases, namely PknA, PknB and PknG, have been related to mycobacterial growth. Previous studies have shown that PknB is essential for mycobacterial growth, is expressed during the log phase of growth, and phosphorylates substrates involved in peptidoglycan biosynthesis. In recent years many high-affinity inhibitors of PknB have been reported. Data fusion has previously been shown to enrich active compounds effectively in both structure-based and ligand-based approaches. In this study we applied three data fusion ranking algorithms to the PknB dataset, namely sum rank, sum score and reciprocal rank. We found that the reciprocal rank algorithm is able to retrieve active compounds earlier in a virtual screening process. We also screened the Asinex database with the reciprocal rank algorithm to identify possible inhibitors of PknB.
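The three fusion rules named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the min-max normalisation used for sum score, the alphabetical tie-breaking, and the assumption that every compound appears in every input list are all choices made here for illustration.

```python
from collections import defaultdict


def ranks_from_scores(scores, higher_is_better=True):
    """Turn {compound_id: score} into {compound_id: rank}, with rank 1 = best.

    Pass higher_is_better=False for scores where more negative is better
    (docking scores are often of this kind).
    """
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {cid: pos + 1 for pos, cid in enumerate(ordered)}


def sum_rank(rank_lists):
    """Fuse by summing each compound's ranks; the smallest sum comes first."""
    fused = defaultdict(float)
    for ranks in rank_lists:
        for cid, r in ranks.items():
            fused[cid] += r
    return sorted(fused, key=lambda c: (fused[c], c))


def sum_score(score_lists):
    """Fuse by summing min-max normalised scores; the largest sum comes first.

    Assumption: each input is already oriented so that higher is better.
    """
    fused = defaultdict(float)
    for scores in score_lists:
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero for constant scores
        for cid, s in scores.items():
            fused[cid] += (s - lo) / span
    return sorted(fused, key=lambda c: (-fused[c], c))


def reciprocal_rank(rank_lists):
    """Fuse by summing 1/rank over all methods; the largest sum comes first."""
    fused = defaultdict(float)
    for ranks in rank_lists:
        for cid, r in ranks.items():
            fused[cid] += 1.0 / r
    return sorted(fused, key=lambda c: (-fused[c], c))
```

Because 1/rank decays quickly, reciprocal rank rewards a compound that is placed at the very top by at least one method. For the hypothetical rank lists `{X:1, Y:2, Z:3, W:4}`, `{Z:1, Y:2, W:3, X:4}` and `{W:1, Z:2, Y:3, X:4}`, sum rank orders the compounds Z, Y, W, X, whereas reciprocal rank orders them Z, W, X, Y.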
In our work we used both structure-based and ligand-based approaches for virtual screening, and combined their results using a variety of data fusion methods. We found that data fusion increases the chance of actives being ranked highly. Specifically, we found that the rankings from pharmacophore search, ROCS and Glide XP, fused with a reciprocal ranking algorithm, not only outperform the individual structure-based and ligand-based approaches but also rank actives better than the other two data fusion methods, as measured by the BEDROC, robust initial enhancement (RIE) and AUC metrics. These fused results were used to identify 45 candidate compounds for further experimental validation.
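To make the three evaluation metrics concrete, the following Python sketch computes AUC, RIE and BEDROC from a ranked list of active/decoy labels. It follows the standard rank-based definitions rather than any code from the study. The function names are hypothetical, the default weighting parameter `alpha=20.0` is an assumption (the abstract does not state the value used), and ties in the ranking are not handled.

```python
import math


def _active_ranks(is_active):
    """Return (1-based ranks of actives, total number of compounds)."""
    total = len(is_active)
    ranks = [pos + 1 for pos, flag in enumerate(is_active) if flag]
    if not ranks or len(ranks) == total:
        raise ValueError("need at least one active and one decoy")
    return ranks, total


def auc(is_active):
    """ROC AUC: fraction of (active, decoy) pairs with the active ranked first."""
    ranks, total = _active_ranks(is_active)
    n = len(ranks)
    decoys_above = sum(ranks) - n * (n + 1) / 2  # decoys ahead of each active, summed
    return 1.0 - decoys_above / (n * (total - n))


def rie(is_active, alpha=20.0):
    """Robust initial enhancement: exponentially weighted early-recognition score,
    divided by its expected value when actives are placed uniformly at random."""
    ranks, total = _active_ranks(is_active)
    observed = sum(math.exp(-alpha * r / total) for r in ranks) / len(ranks)
    expected = (1.0 / total) * (1.0 - math.exp(-alpha)) / (math.exp(alpha / total) - 1.0)
    return observed / expected


def bedroc(is_active, alpha=20.0):
    """BEDROC: RIE rescaled to [0, 1] using its minimum and maximum values."""
    ranks, total = _active_ranks(is_active)
    ratio = len(ranks) / total  # fraction of actives in the list
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie(is_active, alpha) - rie_min) / (rie_max - rie_min)
```

Each function takes a list of booleans ordered from best-ranked to worst-ranked compound, for example `bedroc([True, False, True, False])`. AUC weights the whole list equally, while RIE and BEDROC emphasise actives found near the top, which is why they are commonly reported for early recognition in virtual screening.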
We show that very different structure-based and ligand-based methods for predicting drug-target interactions can be combined effectively using data fusion, outperforming any single method in the ranking of actives. Such fused results show promise for a coherent selection of candidates for biological screening.