Teramoto Reiji, Fukunishi Hiroaki
Bio-IT Center, NEC Corporation, 34, Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan.
J Chem Inf Model. 2008 Feb;48(2):288-95. doi: 10.1021/ci700239t. Epub 2008 Jan 30.
The evaluation of ligand conformations is a crucial aspect of structure-based virtual screening, and scoring functions play a significant role in it. While consensus scoring (CS) generally improves enrichment by compensating for the deficiencies of individual scoring functions, choosing which scoring functions to combine remains challenging when few known active compounds are available. To address this problem, we propose feature selection-based consensus scoring (FSCS), which performs supervised feature selection with docked native ligand conformations to select complementary scoring functions. We evaluated the enrichments of five scoring functions (F-Score, D-Score, PMF, G-Score, and ChemScore), FSCS, and rank-by-rank consensus scoring (RCS) for four different target proteins: acetylcholinesterase (AChE), thrombin, phosphodiesterase 5 (PDE5), and peroxisome proliferator-activated receptor gamma (PPARγ). The results indicated that FSCS was able to select complementary scoring functions and enhance ligand enrichment, and that it outperformed RCS and the individual scoring functions for all four target proteins. They also indicated that the performance of each single scoring function was strongly dependent on the target protein. An especially favorable result for practical drug screening is that FSCS performs well even if only one 3D structure of the protein-ligand complex is known. Moreover, we found that feature selection can be used before actual docking to infer which scoring functions significantly enrich active compounds, and that the selected scoring functions are complementary.
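As an illustration of the RCS baseline and the enrichment measure mentioned above, here is a minimal sketch. The abstract does not give formulas, so two common definitions are assumed: rank-by-rank consensus as the average of each compound's per-function ranks, and the enrichment factor as the active rate in the top fraction divided by the active rate in the whole library. The function names, the convention that a lower score is better, and the toy scores are hypothetical; this is not the authors' implementation, and it does not cover the FSCS feature selection step.

```python
from typing import Dict, List, Sequence


def ranks(scores: Sequence[float]) -> List[float]:
    """Rank scores ascending (1 = best, assuming lower score = better).

    Tied scores share the average of the ranks they span.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all compounds with the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def rank_by_rank(score_table: Dict[str, Sequence[float]]) -> List[float]:
    """Average each compound's rank over all scoring functions (assumed RCS)."""
    per_function = [ranks(s) for s in score_table.values()]
    n_compounds = len(per_function[0])
    return [
        sum(r[i] for r in per_function) / len(per_function)
        for i in range(n_compounds)
    ]


def enrichment_factor(
    consensus: Sequence[float], is_active: Sequence[bool], top_fraction: float
) -> float:
    """Active rate in the top-ranked fraction divided by the overall active rate."""
    n = len(consensus)
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    n_top = max(1, round(n * top_fraction))
    # Lower consensus rank means the compound is ranked better.
    top = sorted(range(n), key=lambda i: consensus[i])[:n_top]
    hits = sum(is_active[i] for i in top)
    return (hits / n_top) / (total_actives / n)


# Toy data: two hypothetical scoring functions, four compounds.
toy_scores = {
    "func_a": [-10.0, -5.0, -7.0, -1.0],
    "func_b": [-8.0, -9.0, -3.0, -2.0],
}
toy_actives = [True, True, False, False]
toy_consensus = rank_by_rank(toy_scores)
toy_ef = enrichment_factor(toy_consensus, toy_actives, top_fraction=0.5)
```

With the toy data, the consensus ranks are 1.5, 2.0, 2.5, and 4.0, and both actives fall in the top half, giving an enrichment factor of 2.0. Restricting `toy_scores` to a subset of scoring functions is where a selection step such as FSCS would plug in.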