Lu I-Lin, Wang Hsiuying
Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan.
J Comput Biol. 2012 Nov;19(11):1215-26. doi: 10.1089/cmb.2012.0188. Epub 2012 Oct 17.
Protein-based virtual screening plays an important role in the modern drug discovery process. Most protein-based virtual screening experiments are carried out with docking programs. The accuracy of a docking program depends heavily on its scoring function, which combines various energy terms. Existing scoring functions weight all energy terms equally or use weights derived from physical characteristics, so they are not protein dependent. We expect that a protein-specific scoring function that reflects the characteristics of the target protein may improve docking results. We therefore propose a protein-specific rescoring approach that selects potential ligands by adjusting the weights of the energy terms. The protein-specific scoring function is based on linear regression analysis combined with an outlier detection approach. The scoring function of the DOCK program is used as the model system. We evaluated the method on the DUD docked data set, which contains 40 protein targets. The results show that the method improves the enrichment factors for most of the 40 protein targets. We further extend the protein-specific scoring function to a larger database, and the results also show significant improvement. The method is not limited to the DOCK scoring function; it can be adapted to improve other programs such as GOLD and Glide. We believe that this method can be applied to virtual screening experiments and can raise the hit rate significantly, which would benefit the modern drug discovery process.
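The abstract describes the rescoring idea only in outline, so the following is a minimal Python sketch of one way it could be realized, not the authors' implementation. It assumes that each docked compound comes with a row of energy terms and a 0/1 activity label, that the weights are fit by ordinary least squares, and that outliers are compounds whose residual exceeds a multiple of the residual standard deviation. The function names (`fit_weights`, `rescore`, `enrichment_factor`) and the `z_cut` and `top_fraction` parameters are hypothetical and introduced only for illustration.

```python
import numpy as np

def fit_weights(energy_terms, labels, z_cut=2.5):
    """Fit per-protein weights by least squares, then refit once without outliers.

    energy_terms: (n_compounds, n_terms) array of docking energy terms.
    labels: 1 for known actives, 0 for decoys (assumed training signal).
    z_cut: residual cutoff in units of the residual standard deviation (assumption).
    """
    y = np.asarray(labels, dtype=float)
    # Design matrix: intercept column followed by the energy terms.
    X = np.column_stack([np.ones(len(y)), energy_terms])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sd = resid.std(ddof=X.shape[1])
    # Keep compounds whose residual is within z_cut standard deviations.
    keep = np.abs(resid) <= z_cut * sd if sd > 0 else np.ones(len(y), dtype=bool)
    beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    return beta, keep

def rescore(energy_terms, beta):
    """Protein-specific score; higher means more active-like under this fit."""
    return beta[0] + np.asarray(energy_terms, dtype=float) @ beta[1:]

def enrichment_factor(scores, labels, top_fraction=0.01):
    """Fraction of actives in the top-ranked subset divided by the overall fraction."""
    labels = np.asarray(labels)
    n_top = max(1, int(round(top_fraction * len(labels))))
    top = np.argsort(-np.asarray(scores))[:n_top]  # best (highest) scores first
    return (labels[top].sum() / n_top) / (labels.sum() / len(labels))
```

In use, one would call `fit_weights` on the docked compounds of a single target, pass the resulting weights to `rescore`, and compare `enrichment_factor` for the rescored ranking against the ranking from the original docking score. Note the sign convention: docking energies are usually better when lower, whereas this sketch ranks by a fitted score that is better when higher.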