Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Hefei, China.
Key Laboratory of Intelligent Computing and Signal Processing, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Hefei, China.
Brief Funct Genomics. 2024 May 15;23(3):286-294. doi: 10.1093/bfgp/elad037.
The precise identification of drug-protein interactions (DPIs) can significantly speed up the drug discovery process. Bioassay methods are too time-consuming and expensive to screen every drug-protein pair, and machine-learning-based methods cannot accurately predict large numbers of DPIs. Compared with traditional computational methods, deep learning methods need less domain knowledge and have strong data learning ability. In this study, we construct a DPI prediction model, called DCA-DPI, based on dual-channel neural networks with an efficient path attention mechanism. The model takes the drug molecular graph and the protein sequence as input. A residual graph neural network and a residual convolutional network learn the feature representations of the drug and the protein, respectively, yielding a feature vector for the drug and hidden vectors for the protein. To obtain a more accurate protein feature vector, a neural attention mechanism computes a weighted sum of the protein hidden vectors. Finally, the drug and protein vectors are concatenated and fed into a fully connected layer for classification. To evaluate the performance of DCA-DPI, three widely used public datasets, Human, C. elegans and DUD-E, are used in the experiments. In these experiments, DCA-DPI achieves better evaluation metric values than other relevant methods. Experiments show that our model is efficient for DPI prediction.
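The last stages of the pipeline described in the abstract (attention-weighted pooling of the protein hidden vectors, concatenation with the drug vector, and a fully connected classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how attention scores are computed, so the dot product with the drug vector is an assumption, as are the function names `attention_pool` and `classify`, the vector sizes, and the random stand-ins for the encoder outputs and layer weights.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(drug_vec, protein_hidden):
    """Collapse per-position protein hidden vectors into one vector.

    Assumption: each position is scored by its dot product with the
    drug vector; the abstract does not specify the scoring function.
    """
    scores = [sum(d * h for d, h in zip(drug_vec, row)) for row in protein_hidden]
    weights = softmax(scores)
    dim = len(protein_hidden[0])
    pooled = [
        sum(w * row[j] for w, row in zip(weights, protein_hidden))
        for j in range(dim)
    ]
    return pooled, weights


def classify(drug_vec, protein_vec, weight, bias):
    """Concatenate both vectors, apply one linear layer, then a sigmoid."""
    joint = drug_vec + protein_vec  # list concatenation
    logit = sum(w * x for w, x in zip(weight, joint)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Random stand-ins for the encoder outputs (hypothetical sizes).
rng = random.Random(0)
dim = 4
drug_vec = [rng.uniform(-1, 1) for _ in range(dim)]
protein_hidden = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]

protein_vec, attn = attention_pool(drug_vec, protein_hidden)
weight = [rng.uniform(-1, 1) for _ in range(2 * dim)]  # untrained weights
prob = classify(drug_vec, protein_vec, weight, bias=0.0)
print(f"attention weights sum to {sum(attn):.3f}; interaction score {prob:.3f}")
```

In a trained model the two input vectors would come from the residual graph and convolutional encoders and the classifier weights would be learned, so the score printed here carries no predictive meaning.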