Center for Bioinformatics, University of Hamburg, Bundesstr. 43, 20146, Hamburg, Germany.
J Comput Aided Mol Des. 2012 Jun;26(6):701-23. doi: 10.1007/s10822-011-9531-0. Epub 2011 Dec 27.
The HYDE scoring function consistently describes hydrogen bonding, the hydrophobic effect and desolvation. It relies on HYdration and DEsolvation terms, which are calibrated using octanol/water partition coefficients of small molecules. Because no affinity data are used for calibration, HYDE is generally applicable to all protein targets. HYDE reflects the Gibbs free energy of binding while considering only the essential interactions of protein-ligand complexes. The greatest benefit of HYDE is that it yields an intuitive atom-based score, which can be mapped onto the ligand and protein atoms. This allows direct visualization of the score and consequently facilitates analysis of protein-ligand complexes during the lead optimization process. In this study, we validated our new scoring function by applying it in large-scale docking experiments. We successfully predicted the correct binding mode in 93% of complexes in redocking calculations on the Astex diverse set, and in virtual screening experiments using the DUD dataset we obtained significant enrichment, with a mean AUC of 0.77 across all protein targets with little or no structural defects. As part of these studies, we also carried out a detailed analysis of the data that revealed interesting pitfalls, which we highlight here and which should be addressed in future benchmark datasets.
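To make the reported enrichment metric concrete, the following minimal sketch shows one common way to compute a ROC AUC per target and average it across targets. It is not the authors' evaluation code. The function name `roc_auc`, the target names and all scores are hypothetical, and the sketch assumes that a lower score means a better predicted binder, as one would expect for an estimated binding free energy.

```python
from typing import Sequence


def roc_auc(active_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random active outranks a random decoy.

    Assumption: lower score = better predicted binder. Ties count as half a win.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Made-up scores for two hypothetical targets (not data from the paper).
targets = {
    "target_A": ([-30.0, -25.0, -10.0], [-20.0, -5.0, -1.0, 0.0]),
    "target_B": ([-12.0, -8.0], [-15.0, -9.0, -3.0]),
}

# Per-target AUC, then the unweighted mean across targets.
aucs = {name: roc_auc(actives, decoys) for name, (actives, decoys) in targets.items()}
mean_auc = sum(aucs.values()) / len(aucs)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys. The pairwise comparison is quadratic in the number of compounds, which is fine for illustration; a rank-based formulation would be preferable for full-size screening sets.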