College of Computer Science and Technology, China University of Petroleum, Qingdao 266555, China.
State Key Laboratory of Computer Architecture, Institute of Computing Technology, University of Chinese Academy of Sciences, Beijing 100080, China.
Int J Mol Sci. 2022 Mar 29;23(7):3780. doi: 10.3390/ijms23073780.
Accurately identifying compound-protein interactions (CPI), also known as drug-target interactions (DTI), is a key step in drug discovery. In applications such as virtual screening and drug repurposing, it can significantly reduce the time needed to identify drug candidates and help provide patients with timely and effective treatment. Recently, a growing number of researchers have developed deep learning models for CPI prediction, many of which represent a compound by applying a graph convolutional neural network to its 2D molecular graph; however, this approach loses much important information about the compound. In this paper, we propose a novel three-channel deep learning framework for CPI prediction, named SSGraphCPI, which is composed of recurrent neural networks with an attention mechanism and a graph convolutional neural network. In our model, compound features are extracted from both the 1D SMILES string and the 2D molecular graph, which together provide sequential and structural features for CPI prediction. Additionally, we use a 1D CNN module to learn hidden patterns in the sequence and mine deeper information. This design makes the model better suited to capturing informative compound features. Experimental results show that our method achieves RMSE (Root Mean Square Error) = 2.24 and R2 (degree of linear fitting of the model) = 0.039 on the GPCR (G Protein-Coupled Receptors) dataset, and RMSE = 2.64 and R2 = 0.018 on the GPCR dataset, which performs better than some classical deep learning models, including RNN/GCNN-CNN, GCNNet and GATNet.
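The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The abstract does not give a formula for R2, so the sketch assumes that "degree of linear fitting" means the squared Pearson correlation between measured and predicted values. The function names and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_pearson(y_true, y_pred):
    """Squared Pearson correlation.

    Assumption: this is one common reading of "degree of linear fitting";
    the abstract itself does not state which definition of R2 is used.
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return (cov * cov) / (var_t * var_p)


# Hypothetical affinity values, not taken from the paper.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

print(rmse(measured, predicted))        # 0.5
print(r2_pearson(measured, predicted))  # 0.8
```

Lower RMSE indicates smaller prediction error, while a higher squared correlation indicates a closer linear relationship between predictions and measurements.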