Biotherapeutic and Medicinal Sciences, Biogen, 225 Binney Street, Cambridge, MA 02142, USA.
Molecules. 2024 Feb 13;29(4):830. doi: 10.3390/molecules29040830.
The rank ordering of ligands remains one of the most attractive challenges in drug discovery. While physics-based in silico binding affinity methods dominate the field, they still face limitations, which largely revolve around force-field accuracy and sampling. Machine learning approaches have recently gained traction for protein-ligand binding affinity predictions in early drug discovery programs. In this article, we perform retrospective binding free energy evaluations for 172 compounds from our internal collection, spread over four different protein targets and five congeneric ligand series. We compared multiple state-of-the-art free energy methods, ranging from physics-based methods with different levels of complexity and conformational sampling to the machine-learning-based methods that were available to us. Overall, we found that physics-based methods performed particularly well when the ligand perturbations were made in the solvation region, and did not perform as well when they had to account for large conformational changes in protein active sites. On the other hand, machine-learning-based methods offer a cost-effective alternative for binding free energy calculations, but the accuracy of their predictions is highly dependent on the experimental data available for training the model.
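As an illustration of what a retrospective rank-ordering evaluation can look like, the sketch below computes a Spearman rank correlation between predicted and experimental binding free energies for one ligand series. This is an assumption for illustration only: the abstract does not state which metrics were used, and the function names and the five free energy values (in kcal/mol) are hypothetical, not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, experimental):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("need two equally sized lists with at least 2 values")
    rank_pred = average_ranks(predicted)
    rank_exp = average_ranks(experimental)
    n = len(rank_pred)
    mean_pred = sum(rank_pred) / n
    mean_exp = sum(rank_exp) / n
    cov = sum((a - mean_pred) * (b - mean_exp)
              for a, b in zip(rank_pred, rank_exp))
    var_pred = sum((a - mean_pred) ** 2 for a in rank_pred)
    var_exp = sum((b - mean_exp) ** 2 for b in rank_exp)
    if var_pred == 0 or var_exp == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / sqrt(var_pred * var_exp)


# Hypothetical binding free energies (kcal/mol) for five ligands in one series.
predicted_dg = [-9.1, -8.4, -7.9, -10.2, -6.5]
experimental_dg = [-9.5, -8.0, -8.3, -10.0, -6.9]

rho = spearman_rho(predicted_dg, experimental_dg)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based metric is a natural fit here because the stated goal is to order ligands correctly rather than to reproduce absolute affinities; in a multi-series study such a value would typically be computed per target and per congeneric series.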