School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou 325027, China.
College of Computer Science and Electronic Engineering, Hunan University, Changsha 410012, China.
Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae271.
Accurate inference of potential drug-protein interactions (DPIs) aids in understanding drug mechanisms and developing novel treatments. Existing deep learning models, however, struggle with accurate node representation in DPI prediction, limiting their performance.
We propose a computational framework that integrates global and local features of nodes in the drug-protein bipartite graph for efficient DPI inference. First, we use pre-trained models to encode prior knowledge of drugs and proteins and to derive their initial features. Second, the MinHash and HyperLogLog algorithms estimate the similarity and set cardinality of drug and protein subgraphs, which serve as local features. Third, an energy-constrained diffusion mechanism is integrated into the transformer architecture to capture interdependencies between nodes in the drug-protein bipartite graph and to extract their global features. Finally, we fuse the local and global features of nodes and use multilayer perceptrons to predict the likelihood of potential DPIs. This comprehensive node representation supports efficient prediction of unknown DPIs. Experiments validate the accuracy and reliability of the model, and molecular docking results show that it can identify potential DPIs not present in existing databases. The approach is expected to offer useful insights for drug repurposing and personalized medicine research.
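To make the local-feature step more concrete, the following is a minimal sketch of the two sketching algorithms named above. It is not the authors' implementation. It assumes that each subgraph is summarized as a set of node identifiers (strings), and the function names `minhash_signature`, `estimate_jaccard`, and `hll_cardinality`, together with the parameters `num_perm` and `p`, are hypothetical choices made for illustration. MinHash estimates the Jaccard similarity of two sets, and HyperLogLog estimates the number of distinct elements in a set.

```python
import hashlib
import math


def _hash64(value: str, seed: int) -> int:
    """Deterministic 64-bit hash of a string, varied by an integer seed."""
    data = f"{seed}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_perm=128):
    """MinHash signature: the minimum hash value under each seeded hash."""
    items = list(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash64(x, seed) for x in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots approximates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def hll_cardinality(items, p=10):
    """HyperLogLog estimate of the number of distinct items (m = 2**p registers)."""
    m = 1 << p
    registers = [0] * m
    for x in items:
        h = _hash64(x, 0)
        idx = h >> (64 - p)                      # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)         # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1  # position of the leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)             # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction (linear counting).
        return m * math.log(m / zeros)
    return raw


if __name__ == "__main__":
    # Hypothetical neighborhoods of a drug node and a protein node.
    drug_hood = {f"n{i}" for i in range(0, 200)}
    protein_hood = {f"n{i}" for i in range(100, 300)}
    sig_d = minhash_signature(drug_hood)
    sig_p = minhash_signature(protein_hood)
    print("estimated Jaccard:", estimate_jaccard(sig_d, sig_p))
    print("estimated |drug_hood|:", hll_cardinality(drug_hood))
```

Both estimators trade exactness for fixed-size summaries: the MinHash error shrinks roughly with the square root of `num_perm`, and the HyperLogLog relative error is about 1.04 divided by the square root of the register count. How the resulting estimates are turned into node features in the actual model is not specified in this abstract.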
Our code and data are accessible at: https://github.com/ZZCrazy00/DPI.