Distance plus attention for binding affinity prediction.

Author information

Rahman Julia, Newton M A Hakim, Ali Mohammed Eunus, Sattar Abdul

Affiliations

School of Information and Communication Technology, Griffith University, 170 Kessels Rd, Nathan, 4111, QLD, Australia.

Institute for Integrated and Intelligent Systems (IIIS), Griffith University, 170 Kessels Rd, Nathan, 4111, QLD, Australia.

Publication information

J Cheminform. 2024 May 12;16(1):52. doi: 10.1186/s13321-024-00844-x.

Abstract

Protein-ligand binding affinity plays a pivotal role in drug development, particularly in identifying potential ligands for target disease-related proteins. Accurate affinity predictions can significantly reduce both the time and cost involved in drug development. However, highly precise affinity prediction remains a research challenge. A key to improving affinity prediction is to capture interactions between proteins and ligands effectively. Existing deep-learning-based computational approaches use 3D grids, 4D tensors, molecular graphs, or proximity-based adjacency matrices, which are either resource-intensive or do not directly represent potential interactions. In this paper, we propose atomic-level distance features and attention mechanisms to better capture specific protein-ligand interactions based on donor-acceptor relations, hydrophobicity, and π-stacking atoms. We argue that distances encompass both short-range direct and long-range indirect interaction effects, while attention mechanisms capture levels of interaction effects. On the well-known CASF-2016 dataset, our proposed method, named Distance plus Attention for Affinity Prediction (DAAP), significantly outperforms existing methods, achieving a Correlation Coefficient (R) of 0.909, Root Mean Squared Error (RMSE) of 0.987, Mean Absolute Error (MAE) of 0.745, Standard Deviation (SD) of 0.988, and Concordance Index (CI) of 0.876. The proposed method also shows substantial improvement, around 2% to 37%, on five other benchmark datasets. The program and data are publicly available at https://gitlab.com/mahnewton/daap.

Scientific Contribution Statement

This study introduces distance-based features to predict protein-ligand binding affinity, capitalizing on unique molecular interactions. Furthermore, the incorporation of protein sequence features of specific residues enhances the model's proficiency in capturing intricate binding patterns. The predictive capabilities are further strengthened through the use of a deep learning architecture with attention mechanisms, and an ensemble approach, averaging the outputs of five models, is implemented to ensure robust and reliable predictions.
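
To make the ensemble averaging and the five reported metrics concrete, the following is a minimal pure-Python sketch. It is not taken from the DAAP repository: all function names are hypothetical, and the SD definition (residual standard deviation after a linear fit of measured on predicted affinities, with an N - 1 denominator) is an assumption based on common CASF benchmark practice, since the abstract does not spell it out. The concordance index here counts prediction ties as half-concordant, which is also an assumed convention.

```python
import math
from itertools import combinations


def ensemble_mean(per_model_preds):
    """Average predictions across models; each inner list is one model's outputs."""
    n_models = len(per_model_preds)
    return [sum(vals) / n_models for vals in zip(*per_model_preds)]


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (R)."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def regression_sd(y_true, y_pred):
    """Assumed definition: SD of residuals after fitting y_true = a * y_pred + b."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    a = sum((p - mp) * (t - mt) for t, p in zip(y_true, y_pred)) / sum(
        (p - mp) ** 2 for p in y_pred
    )
    b = mt - a * mp
    return math.sqrt(
        sum((t - (a * p + b)) ** 2 for t, p in zip(y_true, y_pred)) / (n - 1)
    )


def concordance_index(y_true, y_pred):
    """Fraction of pairs with distinct true values that are ranked in the same order."""
    concordant, comparable = 0.0, 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        if t1 == t2:
            continue  # pairs with equal measured affinity are not comparable
        comparable += 1
        if p1 == p2:
            concordant += 0.5  # assumed convention: tied predictions count half
        elif (p1 > p2) == (t1 > t2):
            concordant += 1.0
    return concordant / comparable
```

In use, the outputs of the five trained models would be passed to `ensemble_mean` as five equal-length lists, and the averaged predictions would then be scored against the measured affinities with the remaining functions.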

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a187/11089753/724078ff81d2/13321_2024_844_Fig1_HTML.jpg
