Molecular Modeling Section, Department of Pharmaceutical and Pharmacological Sciences, University of Padova, 35131 Padova, Italy.
Int J Mol Sci. 2019 Jul 20;20(14):3558. doi: 10.3390/ijms20143558.
The number of entries in the Protein Data Bank (PDB) has doubled in the last decade and increased tenfold in the last twenty years. This ever-growing number of structures is having a major impact on Structure-Based Drug Discovery (SBDD): it allows new targets to be investigated and provides multiple structures of the same macromolecule in complex with different ligands. With such a large resource, the most suitable complex often has to be chosen for a molecular docking calculation, and this task is complicated by the plethora of available posing algorithms and scoring functions, which may influence the quality of the outcomes. Here, we report a large benchmark performed on the PDBbind database, covering more than four thousand entries and seventeen popular docking protocols. We found that, even in protein families for which docking protocols generally showed acceptable results, certain ligand-protein complexes are poorly reproduced in the self-docking procedure. This trend is more pronounced in certain protein families, which underlines the importance of identifying a suitable protein-ligand conformation and coupling it to a well-performing docking protocol.
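As an illustration of how a self-docking result might be scored, the sketch below compares a docked pose with the crystallographic ligand pose by root-mean-square deviation (RMSD) and counts the fraction of complexes reproduced within a cutoff. It is a minimal sketch under stated assumptions, not the benchmark's actual protocol: the function names `pose_rmsd` and `success_rate` are hypothetical, the 2.0 Å cutoff is a commonly used convention rather than a value taken from this abstract, atoms are assumed to be listed in the same order in both poses, and no superposition or symmetry correction is applied.

```python
import math


def pose_rmsd(reference, docked):
    """RMSD (in the units of the input, typically angstroms) between two poses.

    Both arguments are sequences of (x, y, z) tuples with atoms in the same
    order. No superposition is performed, because both poses are assumed to
    be expressed in the receptor's coordinate frame.
    """
    if len(reference) != len(docked) or not reference:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (rx - dx) ** 2 + (ry - dy) ** 2 + (rz - dz) ** 2
        for (rx, ry, rz), (dx, dy, dz) in zip(reference, docked)
    )
    return math.sqrt(squared / len(reference))


def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose pose RMSD is at or below the threshold.

    The 2.0 default is an assumed cutoff, not one stated in the source.
    """
    if not rmsds:
        raise ValueError("at least one RMSD value is required")
    return sum(1 for value in rmsds if value <= threshold) / len(rmsds)
```

Computing one RMSD per complex and then calling `success_rate` separately for each protein family would show where a given protocol reproduces the experimental poses poorly, which is the kind of per-family comparison the abstract describes.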