Teramoto Reiji, Fukunishi Hiroaki
Fundamental and Environmental Research Laboratories, NEC Corporation, 34, Miyukigaoka, Tsukuba, Ibaraki 305-8501, Japan.
J Chem Inf Model. 2007 Sep-Oct;47(5):1858-67. doi: 10.1021/ci700116z. Epub 2007 Aug 9.
Protein-ligand docking programs are used to efficiently discover novel ligands for target proteins in large-scale compound databases, but better scoring methods are still needed. Scoring functions are generally optimized, by various techniques, for their ability to reproduce X-ray structures and protein-ligand binding affinities. Such scoring functions do not work well for every target protein, however, so a scoring function should be optimized for the target protein at hand to enhance enrichment in structure-based virtual screening. To address this problem, we propose the supervised scoring model (SSM), which applies supervised learning to docked ligand conformations so that the protein-ligand binding process is taken into account when a scoring function is optimized against a target protein. To predict binding energy, SSM exploits a rough linear correlation between binding free energy and the root mean square deviation (RMSD) of a native ligand. We applied SSM to the FlexX scoring function, F-Score, for five target proteins: thymidine kinase (TK), estrogen receptor (ER), acetylcholinesterase (AChE), phosphodiesterase 5 (PDE5), and peroxisome proliferator-activated receptor gamma (PPARγ). For all five proteins, SSM gave better enrichment than F-Score, and the improvement was particularly marked for TK, AChE, and PPARγ. We also show that SSM is especially good at enhancing enrichment among the top-ranked screened compounds, which is useful in practical drug screening.
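The two ideas the abstract relies on, labeling docked poses through a rough linear energy-RMSD relation and measuring enrichment among the top-ranked compounds, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the learning algorithm, so ordinary least squares stands in for the supervised learner. The function names, the `e_native` and `slope` constants, and the use of per-pose score terms as features are all hypothetical choices made for illustration.

```python
import numpy as np


def rmsd_to_pseudo_energy(rmsd, e_native=-10.0, slope=1.0):
    """Assumed rough linear relation: the farther a docked pose is from the
    native pose (larger RMSD), the less favorable its pseudo binding energy.
    e_native and slope are placeholder values, not taken from the paper."""
    return e_native + slope * np.asarray(rmsd, dtype=float)


def _design_matrix(terms):
    """Append a constant column so the fit includes an intercept."""
    terms = np.asarray(terms, dtype=float)
    return np.column_stack([terms, np.ones(len(terms))])


def fit_linear_scorer(terms, target):
    """Least-squares weights (last entry is the intercept) that map per-pose
    score terms to the pseudo-energy labels. Stand-in for the supervised
    learner, which the abstract does not specify."""
    weights, *_ = np.linalg.lstsq(
        _design_matrix(terms), np.asarray(target, dtype=float), rcond=None
    )
    return weights


def predict(terms, weights):
    """Score poses with the fitted weights (lower means more favorable)."""
    return _design_matrix(terms) @ weights


def enrichment_factor(scores, is_active, top_fraction):
    """Fraction of actives in the top-ranked subset divided by the fraction
    of actives in the whole library. Lower scores rank first."""
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_top = max(1, int(round(top_fraction * len(scores))))
    top = np.argsort(scores, kind="stable")[:n_top]
    return is_active[top].mean() / is_active.mean()
```

In use, one would fit `fit_linear_scorer` on docked poses of the native ligand for a given target, score each library compound with `predict`, and compare `enrichment_factor` at a small `top_fraction` against the same quantity computed from the unmodified scoring function.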