Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium.
Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany.
J Chem Inf Model. 2024 Jul 8;64(13):5063-5076. doi: 10.1021/acs.jcim.4c00417. Epub 2024 Jun 19.
In drug discovery, the in silico prediction of binding affinity is one of the major means to prioritize compounds for synthesis. Alchemical relative binding free energy (RBFE) calculations based on molecular dynamics (MD) simulations are now a popular approach for the accurate affinity ranking of compounds. MD simulations rely on empirical force field parameters, which strongly influence the accuracy of the predicted affinities. Here, we evaluate the ability of six different small-molecule force fields to predict experimental protein-ligand binding affinities in RBFE calculations on a set of 598 ligands and 22 protein targets. The public force fields OpenFF Parsley and Sage, GAFF, and CGenFF show comparable accuracy, while OPLS3e is significantly more accurate. However, a consensus approach using Sage, GAFF, and CGenFF leads to accuracy comparable to OPLS3e. While Parsley and Sage perform comparably based on aggregated statistics across the whole dataset, they differ in terms of outliers. Analysis of the force fields reveals that improved parameters lead to significant improvement in the accuracy of affinity predictions on subsets of the dataset involving those parameters. Lower accuracy cannot be attributed to the force field parameters alone; it also depends on input preparation and sampling convergence of the calculations. Large perturbations and nonconverged simulations in particular lead to less accurate predictions. The input structures, Gromacs force field files, and the analysis Python notebooks are available on GitHub.
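The consensus approach mentioned above can be illustrated with a minimal sketch. The abstract does not state how the consensus is formed, so this example assumes a simple arithmetic mean of the per-perturbation relative free energies (ΔΔG) predicted by each force field. The function names, the ligand labels, and all numerical values are hypothetical and are not taken from the study.

```python
from statistics import mean


def consensus_ddg(predictions):
    """Average ddG (kcal/mol) per perturbation across force fields.

    `predictions` maps a force-field name to a dict that maps a
    perturbation (ligand pair) to its predicted ddG. Only perturbations
    present for every force field are kept. Plain averaging is an
    assumption; the abstract does not specify the consensus scheme.
    """
    common = set.intersection(*(set(p) for p in predictions.values()))
    return {
        edge: mean(p[edge] for p in predictions.values())
        for edge in sorted(common)
    }


def mean_unsigned_error(predicted, experimental):
    """Mean absolute deviation over perturbations found in both dicts."""
    shared = [edge for edge in predicted if edge in experimental]
    return mean(abs(predicted[edge] - experimental[edge]) for edge in shared)


# Hypothetical values for illustration only (not data from the paper).
predictions = {
    "Sage": {("lig1", "lig2"): -1.0, ("lig2", "lig3"): 0.5},
    "GAFF": {("lig1", "lig2"): -0.4, ("lig2", "lig3"): 1.1},
    "CGenFF": {("lig1", "lig2"): -1.6, ("lig2", "lig3"): 0.2},
}
experimental = {("lig1", "lig2"): -1.0, ("lig2", "lig3"): 0.6}

consensus = consensus_ddg(predictions)
consensus_mue = mean_unsigned_error(consensus, experimental)
per_ff_mue = {
    name: mean_unsigned_error(ddg, experimental)
    for name, ddg in predictions.items()
}
```

In this toy case the errors of the individual force fields partly cancel, so `consensus_mue` is smaller than every entry in `per_ff_mue`. Whether such cancellation occurs on real data depends on how correlated the force-field errors are.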