Tang Xin, Lei Xiujuan, Liu Lian
School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
Interdiscip Sci. 2025 Jun 2. doi: 10.1007/s12539-025-00713-7.
Accurate computational methods for predicting drug-target affinity (DTA) are essential because they reduce the need for biochemical experiments and enable rapid screening of potentially druggable compounds. Current deep learning-based DTA prediction methods predominantly rely on single-modal information from drugs or targets. In this article, we propose a new multi-modal DTA prediction method, MGSDTA, which integrates graph features and sequence features of drug molecules and target proteins. Graph features are extracted from the drug molecular graphs and target protein graphs. Sequence features are extracted using continuous embeddings generated by the self-supervised pre-trained models Mol2vec and ProtVec, for drug substructures and target subsequences respectively. Finally, the graph and sequence features are integrated by a weighted fusion module for DTA prediction. Experiments on benchmark datasets indicate that MGSDTA outperforms single-modal methods based solely on sequences or graphs.
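The abstract does not specify how the weighted fusion module combines the two modalities. As a rough illustration only, the sketch below assumes one common form: two feature vectors of equal length are combined element-wise using weights obtained by a softmax over two scalar scores. The function names `softmax` and `weighted_fusion`, the `scores` parameter, and the toy vectors are hypothetical and are not taken from MGSDTA; in a trained model the scores would typically be learned parameters.

```python
import math


def softmax(scores):
    """Normalise raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(graph_feat, seq_feat, scores=(0.0, 0.0)):
    """Fuse a graph-based and a sequence-based feature vector.

    Assumption (not from the paper): the fused vector is a convex
    combination of the two inputs, with one weight per modality.
    """
    if len(graph_feat) != len(seq_feat):
        raise ValueError("feature vectors must have the same length")
    w_graph, w_seq = softmax(list(scores))
    return [w_graph * g + w_seq * s for g, s in zip(graph_feat, seq_feat)]


# Toy example: equal scores give each modality a weight of 0.5.
fused = weighted_fusion([1.0, 2.0], [3.0, 4.0])
print(fused)
```

With equal scores the result is the plain average of the two vectors; raising one score shifts the fused representation toward that modality. A real implementation would operate on learned embeddings and pass the fused vector to a regression head that outputs the predicted affinity.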