Xie Zhiqi, Zhang Peng, Fan Zipeng, Zhang Qingpeng, Lin Qianxi
College of Intelligence and Computing, Tianjin University, Tianjin 300072, China.
Musketeers Foundation Institute of Data Science and the Department of Pharmacology and Pharmacy, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China.
Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf298.
Predicting protein-ligand binding affinity accurately and quickly is a major challenge in drug discovery. Recent advances suggest that deep learning-based computational methods can quantify binding affinity effectively, making them a promising alternative. Environmental factors significantly influence the interactions between protein pockets and ligands and therefore the binding strength. However, many existing deep learning approaches overlook these environmental effects and extract features from proteins and ligands based solely on their sequences or structures.
We propose EM-PLA, a deep learning method built on an environment-aware heterogeneous graph neural network that uses multimodal data. The method improves protein-ligand binding affinity prediction by incorporating environmental information derived from the biochemical properties of proteins and ligands. Specifically, EM-PLA employs a heterogeneous graph neural network (HGT) with environmental information to improve the calculation of non-covalent interactions, and it also models the interactions between protein sequences and ligand sequences. We evaluate EM-PLA in comprehensive benchmark experiments for binding affinity prediction, where it shows superior performance and generalization capability compared with state-of-the-art baseline methods. Ablation experiments, together with visual analyses and case studies, validate the rationale behind the proposed method. These results indicate that EM-PLA is an effective method for binding affinity prediction and may provide valuable insights for future applications.
The source code is available at https://github.com/littlemou22/EM-PLA.
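To make the idea of a heterogeneous protein-ligand graph with environmental information more concrete, the following is a minimal illustrative sketch and not the EM-PLA implementation. It assumes that non-covalent contacts are approximated by a simple distance cutoff (5 Å here, an arbitrary choice) and that the "environment" of an atom is summarized as the number of cross-type contacts it makes. The function name `build_hetero_graph`, the edge-type labels, and the toy coordinates are all hypothetical.

```python
import math
from collections import defaultdict


def build_hetero_graph(ligand_xyz, pocket_xyz, ligand_bonds, cutoff=5.0):
    """Build typed edge lists and a per-atom contact count.

    Assumptions (not taken from the paper): contacts are defined by a
    distance cutoff, and the environment descriptor is a contact count.
    """
    edges = defaultdict(list)

    # Covalent ligand-ligand edges, stored in both directions.
    for i, j in ligand_bonds:
        edges[("ligand", "bond", "ligand")].append((i, j))
        edges[("ligand", "bond", "ligand")].append((j, i))

    lig_env = [0] * len(ligand_xyz)
    poc_env = [0] * len(pocket_xyz)

    # Non-covalent ligand-pocket edges from the distance cutoff.
    for i, a in enumerate(ligand_xyz):
        for j, b in enumerate(pocket_xyz):
            if math.dist(a, b) <= cutoff:
                edges[("ligand", "contact", "pocket")].append((i, j))
                edges[("pocket", "contact", "ligand")].append((j, i))
                lig_env[i] += 1
                poc_env[j] += 1

    return dict(edges), {"ligand": lig_env, "pocket": poc_env}


# Toy example: two bonded ligand atoms and two pocket atoms (coordinates in Å).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pocket = [(0.0, 0.0, 4.0), (10.0, 10.0, 10.0)]
edges, env = build_hetero_graph(ligand, pocket, ligand_bonds=[(0, 1)])
print(edges[("ligand", "contact", "pocket")])
print(env)
```

Keeping one edge list per (source type, relation, target type) triple mirrors the way heterogeneous graph networks typically apply separate parameters to each relation. A real pipeline would replace the contact count with richer biochemical descriptors.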