School of Biological Science and Medical Engineering, Southeast University, Nanjing, China.
Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Melbourne, Australia.
Comput Biol Med. 2024 May;173:108339. doi: 10.1016/j.compbiomed.2024.108339. Epub 2024 Mar 18.
Applying Artificial Intelligence (AI) to screen drug molecules for potential therapeutic effects has revolutionized drug discovery, with significantly lower economic cost and time consumption than the traditional drug discovery pipeline. AI makes it possible to rapidly search the vast chemical space for potential drug-target interactions (DTIs) between candidate drug molecules and disease protein targets. However, only a small proportion of molecules have labelled DTIs, which limits the performance of AI-based drug screening. Addressing this problem requires a machine learning approach whose DTI predictions generalize well to molecules that were not seen during training. Many existing machine learning approaches for DTI identification fail to fully exploit the topological structure of candidate molecules. To develop a better approach for DTI prediction, we propose GraphormerDTI, which employs the Graph Transformer neural network to model molecular structures. GraphormerDTI embeds molecular graphs into vector representations through iterative Transformer-based message passing, which encodes each molecule's structural characteristics through node centrality encoding, node spatial encoding and edge encoding. With this strong structural inductive bias, GraphormerDTI can effectively infer informative representations for out-of-sample molecules and, as such, can predict DTIs for unseen molecules with strong performance. GraphormerDTI combines the Graph Transformer neural network, which extracts the drug representations, with a 1-dimensional Convolutional Neural Network (1D-CNN), which extracts the target protein representations, and uses an attention mechanism to model the interactions between them.
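As a rough illustration of the structural signals named above, the following minimal sketch computes two of them for a toy molecular graph: the node degree, which a Graphormer-style model typically uses to index a learned centrality embedding, and the pairwise shortest-path distance, which typically indexes a learned bias on the attention scores. The function name `structural_features` and the three-atom example are assumptions made for illustration and are not taken from the GraphormerDTI code; edge encoding and the learned embeddings themselves are omitted.

```python
from collections import deque


def structural_features(num_atoms, bonds):
    """Return (degree, spd) for an undirected molecular graph.

    degree[i] is the number of bonds on atom i (input to a centrality encoding).
    spd[i][j] is the shortest-path distance in bonds between atoms i and j
    (input to a spatial encoding); -1 marks unreachable pairs.
    Hypothetical helper for illustration only.
    """
    adjacency = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    degree = [len(neighbours) for neighbours in adjacency]

    spd = []
    for source in range(num_atoms):
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = [-1] * num_atoms
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        spd.append(dist)
    return degree, spd


# Toy example: the heavy atoms of ethanol, C(0)-C(1)-O(2).
degree, spd = structural_features(3, [(0, 1), (1, 2)])
print(degree)  # [1, 2, 1]
print(spd)     # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

Both quantities depend only on the graph topology, so they can be computed for any molecule, including one that never appeared in training.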
To examine GraphormerDTI's performance for DTI prediction, we conduct experiments on three benchmark datasets. GraphormerDTI outperforms five state-of-the-art baselines (GNN-CPI, GNN-PT, DeepEmbedding-DTI, MolTrans and HyperAttentionDTI) on out-of-molecule DTI prediction and is on a par with the best baseline on transductive DTI prediction. The source code and datasets are publicly accessible at https://github.com/mengmeng34/GraphormerDTI.
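The difference between the out-of-molecule and transductive settings can be sketched as a data split. In the sketch below, whole drugs are held out, so no test drug appears in any training pair; a transductive split would instead shuffle individual pairs, and the same drug could then appear on both sides. The function `out_of_molecule_split`, its parameters and the toy records are assumptions for illustration; the abstract does not describe the exact protocol used in the experiments.

```python
import random


def out_of_molecule_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug, target, label) records so that test drugs are unseen in training.

    Hypothetical helper: the split is made over unique drug identifiers,
    not over individual drug-target pairs.
    """
    drugs = sorted({drug for drug, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)

    n_test = max(1, int(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])

    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test


# Toy records with made-up identifiers and labels.
pairs = [
    ("D1", "P1", 1), ("D1", "P2", 0),
    ("D2", "P1", 0), ("D3", "P2", 1),
    ("D4", "P3", 1), ("D5", "P1", 0),
]
train, test = out_of_molecule_split(pairs, test_fraction=0.4)

train_drugs = {drug for drug, _, _ in train}
held_out_drugs = {drug for drug, _, _ in test}
print(train_drugs & held_out_drugs)  # set(): no drug is shared between the splits
```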