Kim Hyojin, Shim Heesung, Ranganath Aditya, He Stewart, Stevenson Garrett, Allen Jonathan E
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, United States.
Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, United States.
Front Pharmacol. 2025 Jan 3;15:1518875. doi: 10.3389/fphar.2024.1518875. eCollection 2024.
Recent 3D structure-based deep learning approaches have improved the accuracy of protein-ligand binding affinity prediction in drug discovery. These methods complement physics-based computational modeling, such as molecular docking, for virtual high-throughput screening. Despite the improved predictive performance, most methods in this category are trained on co-crystal complex structures as input and experimentally measured binding affinities as output. However, co-crystal complex structures are not readily available, and inaccurate structures predicted by molecular docking can degrade the accuracy of the machine learning methods.
We introduce a novel structure-based inference method that uses multiple molecular docking poses for each protein-ligand complex. The method employs multi-instance learning with an attention network to predict binding affinity from a collection of docking poses.
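The idea of attention-based multi-instance learning over docking poses can be illustrated with a minimal NumPy sketch. This is a generic attention-pooling formulation, not the authors' network: the abstract does not specify the architecture, so the function name `attention_mil_predict`, the parameters `V`, `w`, `head_w`, `head_b`, the feature dimensions, and the random inputs are all hypothetical. Each pose is assumed to have already been encoded as a fixed-length feature vector; the sketch scores each pose, normalizes the scores into attention weights, and regresses one affinity value from the weighted average.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_mil_predict(pose_features, V, w, head_w, head_b):
    """Predict one affinity value from a bag of docking-pose features.

    pose_features: (n_poses, d) array, one feature vector per pose
                   (assumed to come from some upstream 3D encoder).
    V, w:          hypothetical attention parameters, shapes (d, h) and (h,).
    head_w, head_b: hypothetical linear regression head, shapes (d,) and scalar.
    """
    hidden = np.tanh(pose_features @ V)   # (n_poses, h)
    scores = hidden @ w                   # one unnormalized score per pose
    weights = softmax(scores)             # attention weights, sum to 1
    bag = weights @ pose_features         # (d,) weighted average of poses
    affinity = float(bag @ head_w + head_b)
    return affinity, weights

# Toy example with random, untrained parameters (illustrative only).
rng = np.random.default_rng(0)
n_poses, d, h = 10, 16, 8
pose_features = rng.normal(size=(n_poses, d))
V = rng.normal(size=(d, h))
w = rng.normal(size=h)
head_w = rng.normal(size=d)
head_b = 0.0

affinity, weights = attention_mil_predict(pose_features, V, w, head_w, head_b)
print("predicted affinity (arbitrary units):", affinity)
print("most attended pose index:", int(np.argmax(weights)))
```

Because the pooling is a weighted sum over the bag, the prediction does not depend on the order in which poses are listed, and the attention weights give a rough indication of which poses drive the prediction. In a trained model the parameters would be learned from complexes with measured affinities rather than sampled at random.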
We validate our method on multiple datasets, including PDBbind and compounds targeting the main protease of SARS-CoV-2. The results show that our method, which uses only docking poses, is competitive with state-of-the-art inference models that depend on co-crystal structures.
This method offers binding affinity prediction without requiring co-crystal structures, thereby increasing its applicability to protein targets lacking such data.