PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences.

Authors

Buttenschoen Martin, Morris Garrett M, Deane Charlotte M

Affiliation

Department of Statistics, 24-29 St Giles', Oxford OX1 3LB, UK

Publication information

Chem Sci. 2023 Dec 13;15(9):3130-3139. doi: 10.1039/d3sc04185a. eCollection 2024 Feb 28.

DOI: 10.1039/d3sc04185a

PMID: 38425520

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10901501/

Abstract

The last few years have seen the development of numerous deep learning-based protein-ligand docking methods. They offer huge promise in terms of speed and accuracy. However, despite claims of state-of-the-art performance in terms of crystallographic root-mean-square deviation (RMSD), upon closer inspection, it has become apparent that they often produce physically implausible molecular structures. It is therefore not sufficient to evaluate these methods solely by RMSD to a native binding mode. It is vital, particularly for deep learning-based methods, that they are also evaluated on steric and energetic criteria. We present PoseBusters, a Python package that performs a series of standard quality checks using the well-established cheminformatics toolkit RDKit. The PoseBusters test suite validates chemical and geometric consistency of a ligand including its stereochemistry, and the physical plausibility of intra- and intermolecular measurements such as the planarity of aromatic rings, standard bond lengths, and protein-ligand clashes. Only methods that both pass these checks and predict native-like binding modes should be classed as having "state-of-the-art" performance. We use PoseBusters to compare five deep learning-based docking methods (DeepDock, DiffDock, EquiBind, TankBind, and Uni-Mol) and two well-established standard docking methods (AutoDock Vina and CCDC Gold) with and without an additional post-prediction energy minimisation step using a molecular mechanics force field. We show that both in terms of physical plausibility and the ability to generalise to examples that are distinct from the training data, no deep learning-based method yet outperforms classical docking tools. In addition, we find that molecular mechanics force fields contain docking-relevant physics missing from deep-learning methods. 
PoseBusters allows practitioners to assess docking and molecular generation methods and may inspire new inductive biases still required to improve deep learning-based methods, which will help drive the development of more accurate and more realistic predictions.
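
The abstract's central point, that a low RMSD alone does not make a pose acceptable, can be illustrated with a minimal sketch. This is not the PoseBusters API and does not use RDKit; the function names, the toy coordinates, and both cutoffs (a 2.0 Å RMSD threshold and a fixed 2.0 Å minimum ligand-protein distance) are assumptions chosen purely for illustration. The sketch only shows the idea of requiring a pose to be both native-like and free of a steric clash.

```python
import math

# Hypothetical cutoffs for illustration only; neither value comes from the abstract.
RMSD_CUTOFF = 2.0   # Å, assumed threshold for a "native-like" pose
CLASH_CUTOFF = 2.0  # Å, assumed minimum allowed ligand-protein atom distance


def rmsd(pred, ref):
    """Root-mean-square deviation (Å) between two equally ordered coordinate lists."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def min_distance(ligand, protein):
    """Smallest distance (Å) between any ligand atom and any protein atom."""
    return min(math.dist(a, b) for a in ligand for b in protein)


def assess_pose(pred, ref, protein):
    """Report the RMSD criterion and a simple clash criterion side by side."""
    pose_rmsd = rmsd(pred, ref)
    closest = min_distance(pred, protein)
    native_like = pose_rmsd <= RMSD_CUTOFF
    no_clash = closest >= CLASH_CUTOFF
    return {
        "rmsd": pose_rmsd,
        "min_protein_distance": closest,
        "native_like": native_like,
        "no_clash": no_clash,
        "passes": native_like and no_clash,  # both criteria are required
    }


# Toy coordinates (Å): a three-atom "ligand" and two nearby "protein" atoms.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
protein = [(0.0, 3.0, 0.0), (3.0, 3.0, 0.0)]
pose_a = [(x, y + 0.5, z) for x, y, z in reference]  # small shift towards the protein
pose_b = [(x, y + 1.5, z) for x, y, z in reference]  # larger shift towards the protein

for name, pose in (("pose_a", pose_a), ("pose_b", pose_b)):
    print(name, assess_pose(pose, reference, protein))
```

In this toy case both poses fall under the assumed RMSD cutoff, but `pose_b` sits 1.5 Å from the protein atoms and so fails the clash criterion. A benchmark that scored by RMSD alone would count it as a success.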

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2df/10901501/e53ea7c000b8/d3sc04185a-f1.jpg
