Buttenschoen Martin, Morris Garrett M, Deane Charlotte M
Department of Statistics, 24-29 St Giles', Oxford OX1 3LB, UK
Chem Sci. 2023 Dec 13;15(9):3130-3139. doi: 10.1039/d3sc04185a. eCollection 2024 Feb 28.
The last few years have seen the development of numerous deep learning-based protein-ligand docking methods. They offer huge promise in terms of speed and accuracy. However, despite claims of state-of-the-art performance in terms of crystallographic root-mean-square deviation (RMSD), upon closer inspection, it has become apparent that they often produce physically implausible molecular structures. It is therefore not sufficient to evaluate these methods solely by RMSD to a native binding mode. It is vital, particularly for deep learning-based methods, that they are also evaluated on steric and energetic criteria. We present PoseBusters, a Python package that performs a series of standard quality checks using the well-established cheminformatics toolkit RDKit. The PoseBusters test suite validates chemical and geometric consistency of a ligand including its stereochemistry, and the physical plausibility of intra- and intermolecular measurements such as the planarity of aromatic rings, standard bond lengths, and protein-ligand clashes. Only methods that both pass these checks and predict native-like binding modes should be classed as having "state-of-the-art" performance. We use PoseBusters to compare five deep learning-based docking methods (DeepDock, DiffDock, EquiBind, TankBind, and Uni-Mol) and two well-established standard docking methods (AutoDock Vina and CCDC Gold) with and without an additional post-prediction energy minimisation step using a molecular mechanics force field. We show that both in terms of physical plausibility and the ability to generalise to examples that are distinct from the training data, no deep learning-based method yet outperforms classical docking tools. In addition, we find that molecular mechanics force fields contain docking-relevant physics missing from deep-learning methods. 
PoseBusters allows practitioners to assess docking and molecular generation methods and may inspire new inductive biases still required to improve deep learning-based methods, which will help drive the development of more accurate and more realistic predictions.
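The evaluation criterion argued for above (a pose must be both native-like and physically plausible) can be illustrated with a minimal sketch. This is not the PoseBusters API: the actual package relies on RDKit and runs many more checks (stereochemistry, bond lengths, ring planarity and others). The function names, the 2.0 Å RMSD cutoff, and the 2.0 Å clash distance below are assumptions chosen for illustration, and the RMSD here assumes identically ordered atoms without symmetry correction.

```python
import math


def rmsd(pred, ref):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def min_distance(ligand, protein):
    """Smallest ligand-atom to protein-atom distance."""
    return min(math.dist(a, b) for a in ligand for b in protein)


def is_plausible_success(pred, ref, protein, rmsd_cutoff=2.0, clash_cutoff=2.0):
    """True only if the pose is near the reference AND does not clash.

    Both cutoffs (in angstroms) are illustrative assumptions.
    """
    close_to_native = rmsd(pred, ref) <= rmsd_cutoff
    no_clash = min_distance(pred, protein) >= clash_cutoff
    return close_to_native and no_clash


# Toy data (hypothetical coordinates, in angstroms).
ref = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]          # reference ligand pose
protein = [(0.0, 3.0, 0.0)]                       # a single protein atom
pred_good = [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0)]    # near-native, no clash
pred_clash = [(0.0, 1.5, 0.0), (1.5, 0.0, 0.0)]   # low RMSD, but clashes

good_ok = is_plausible_success(pred_good, ref, protein)
clash_ok = is_plausible_success(pred_clash, ref, protein)
```

In this toy case `pred_clash` stays within the RMSD cutoff yet is rejected because one ligand atom sits too close to the protein atom, which is the kind of failure an RMSD-only evaluation would miss.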