Perola Emanuele
Vertex Pharmaceuticals, 130 Waverly Street, Cambridge, Massachusetts 02139, USA.
Proteins. 2006 Aug 1;64(2):422-35. doi: 10.1002/prot.21002.
In spite of recent improvements in docking and scoring methods, high false-positive rates remain a common issue in structure-based virtual screening. In this study, the distinctive features of false positives in kinase virtual screens were investigated. A series of retrospective virtual screens on kinase targets was performed on specifically designed test sets, each combining true ligands and experimentally confirmed inactive compounds. A systematic analysis of the docking poses generated for the top-ranking compounds highlighted key aspects differentiating true hits from false positives. The most recurring feature in the poses of false positives was the absence of certain key interactions known to be required for kinase binding. A systematic analysis of 444 crystal structures of ligand-bound kinases showed that at least two hydrogen bonds between the ligand and the backbone protein atoms in the kinase hinge region are present in 90% of the complexes, with very little variability across targets. Closer inspection showed that when the two hydrogen bonds are present, one of three preferred hinge-binding motifs is involved in 96.5% of the cases. Less than 10% of the false positives satisfied these two criteria in the minimized docking poses generated by our standard protocol. Ligand conformational artifacts were also shown to contribute to the occurrence of false positives in a number of cases. Application of this knowledge in the form of docking constraints and post-processing filters provided consistent improvements in virtual screening performance on all systems. The false-positive rates were significantly reduced and the enrichment factors increased by an average of twofold. On the basis of these results, a generalized two-step protocol for virtual screening on kinase targets is suggested.
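As an illustration of the post-processing idea described in the abstract, the following minimal Python sketch demotes docking poses that form fewer than two hydrogen bonds to the hinge backbone, then compares the enrichment factor before and after. It is not the author's protocol. The `Pose` class, the function names, the toy scores, and the "lower score is better" convention are all assumptions made for this example. The hinge hydrogen-bond counts are assumed to be precomputed by an upstream tool, and the three preferred hinge-binding motifs are not modelled because the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pose:
    compound_id: str
    score: float       # docking score; lower is better (assumed convention)
    hinge_hbonds: int  # H-bonds to hinge backbone atoms (assumed precomputed)
    is_active: bool    # known label, available only in retrospective screens


def rerank_with_hinge_filter(poses: List[Pose], min_hbonds: int = 2) -> List[Pose]:
    """Rank poses by score, placing those that fail the hinge criterion last."""
    by_score = sorted(poses, key=lambda p: p.score)
    passing = [p for p in by_score if p.hinge_hbonds >= min_hbonds]
    failing = [p for p in by_score if p.hinge_hbonds < min_hbonds]
    return passing + failing


def enrichment_factor(ranked: List[Pose], top_fraction: float) -> float:
    """Active rate in the top fraction divided by the active rate overall."""
    n_top = max(1, int(len(ranked) * top_fraction))
    actives_total = sum(p.is_active for p in ranked)
    if actives_total == 0:
        return 0.0
    actives_top = sum(p.is_active for p in ranked[:n_top])
    return (actives_top / n_top) / (actives_total / len(ranked))


# Hypothetical toy screen: two actives and six confirmed inactives.
toy = [
    Pose("D1", -12.0, 0, False),
    Pose("D2", -11.0, 1, False),
    Pose("A1", -10.0, 2, True),
    Pose("A2", -9.0, 3, True),
    Pose("D3", -8.0, 2, False),
    Pose("D4", -7.0, 0, False),
    Pose("D5", -6.0, 1, False),
    Pose("D6", -5.0, 0, False),
]

ef_score_only = enrichment_factor(sorted(toy, key=lambda p: p.score), 0.25)
ef_with_filter = enrichment_factor(rerank_with_hinge_filter(toy), 0.25)
```

In this toy set the two best-scoring compounds are inactives that lack the hinge hydrogen bonds, so the score-only ranking finds no actives in the top 25%, while the filtered ranking places both actives there. The numbers are contrived to show the mechanism and say nothing about the magnitude of improvement on real targets.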