Department of Software Engineering, Charles University in Prague, Prague, Czech Republic.
J Cheminform. 2015 Apr 1;7:12. doi: 10.1186/s13321-015-0059-5. eCollection 2015.
Protein-ligand binding site prediction from a 3D protein structure plays a pivotal role in rational drug design and can be helpful in predicting drug side effects or elucidating protein function. Embedded within the binding site detection problem is the problem of pocket ranking: how to score and sort candidate pockets so that the best-scored predictions correspond to true ligand binding sites. Although multiple pocket detection algorithms exist, they mostly employ a fairly simple ranking function, leading to sub-optimal prediction results.
We have developed a new pocket scoring approach (named PRANK) that prioritizes putative pockets according to their probability of binding a ligand. The method first carefully selects pocket points and labels them by physico-chemical characteristics of their local neighborhood. A Random Forests classifier is subsequently applied to assign a ligandability score to each of the selected pocket points. The ligandability scores are finally merged into the resulting pocket score, which is used to prioritize the putative pockets. Experimental results on multiple datasets demonstrate that applying our method as a post-processing step greatly increases the prediction quality of Fpocket and ConCavity, two state-of-the-art protein-ligand binding site prediction algorithms.
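The final merging and ranking step can be illustrated with a minimal Python sketch. It assumes that each pocket point already carries a ligandability score in [0, 1] from some classifier. The aggregation rule used here (sum of squared point scores) is an assumption made for illustration, since the abstract does not state how the scores are merged. The names `Pocket`, `pocket_score` and `rank_pockets` are hypothetical and do not come from the PRANK program.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pocket:
    """A candidate pocket with per-point ligandability scores.

    The scores are assumed to be classifier outputs in [0, 1];
    the values used below are made up for illustration.
    """
    name: str
    point_scores: List[float]


def pocket_score(point_scores: List[float]) -> float:
    """Merge per-point scores into one pocket score.

    Assumed rule: sum of squared scores, so confidently scored
    points weigh more than many weakly scored ones.
    """
    return sum(s * s for s in point_scores)


def rank_pockets(pockets: List[Pocket]) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(p.name, pocket_score(p.point_scores)) for p in pockets]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidate pockets, e.g. as reported by a pocket detector.
pockets = [
    Pocket("pocket1", [0.2, 0.3, 0.1, 0.2, 0.2, 0.1]),
    Pocket("pocket2", [0.9, 0.8, 0.7]),
    Pocket("pocket3", [0.5, 0.4]),
]

ranking = rank_pockets(pockets)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy input, the large pocket with many weakly scored points ranks below the smaller pockets whose points score higher, which is the kind of re-ordering a learned point-level score is meant to enable.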
The positive experimental results show that our method can be used to improve the success rate, validity and applicability of existing protein-ligand binding site prediction tools. The method was implemented as a stand-alone program that currently supports Fpocket and ConCavity out of the box and is easily extensible to support other tools. PRANK is freely available at http://siret.ms.mff.cuni.cz/prank.