School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Buk-ku, Gwangju, Republic of Korea.
PLoS Comput Biol. 2019 Jun 14;15(6):e1007129. doi: 10.1371/journal.pcbi.1007129. eCollection 2019 Jun.
Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico DTI prediction approaches. In several computational models, conventional protein descriptors have been shown to be insufficiently informative for accurate DTI prediction. Thus, in this study, we propose a deep learning-based DTI prediction model that captures local residue patterns of proteins participating in DTIs. We apply a convolutional neural network (CNN) to raw protein sequences, performing convolution over amino acid subsequences of various lengths to capture local residue patterns of generalized protein classes. We train our model on large-scale DTI information and evaluate it on an independent dataset not seen during the training phase. Our model performs better than previous protein descriptor-based models, and also better than recently developed deep learning models for massive prediction of DTIs. By examining pooled convolution results, we confirmed that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches. Our code is available at https://github.com/GIST-CSBL/DeepConv-DTI.
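The idea of convolving over amino acid subsequences of several lengths and then inspecting the pooled results can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that): the function names, the window sizes (3, 5, 7), the embedding dimension, and the random, untrained weights are all assumptions chosen for illustration, and the drug-side features and the layers that combine them with the protein features are omitted.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_embedding(dim, rng):
    """Map each residue to a random vector (stand-in for a learned embedding)."""
    return {aa: [rng.uniform(-1, 1) for _ in range(dim)] for aa in AMINO_ACIDS}


def make_filter(window, dim, rng):
    """One convolution filter covering `window` consecutive residues."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(window)]


def conv1d_valid(embedded, filt):
    """Slide one filter along the sequence (no padding) and apply ReLU."""
    window = len(filt)
    scores = []
    for start in range(len(embedded) - window + 1):
        total = 0.0
        for offset in range(window):
            for x, w in zip(embedded[start + offset], filt[offset]):
                total += x * w
        scores.append(max(total, 0.0))
    return scores


def global_max_pool(scores):
    """Return the strongest activation and the position where it occurred."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


def protein_features(sequence, window_sizes=(3, 5, 7),
                     filters_per_window=2, dim=4, seed=0):
    """Pooled features plus (window, start) of each filter's top subsequence."""
    if len(sequence) < max(window_sizes):
        raise ValueError("sequence is shorter than the largest window")
    rng = random.Random(seed)  # untrained weights; hypothetical values only
    embedding = make_embedding(dim, rng)
    embedded = [embedding[aa] for aa in sequence]  # KeyError on unknown residues
    features, hits = [], []
    for window in window_sizes:
        for _ in range(filters_per_window):
            filt = make_filter(window, dim, rng)
            value, start = global_max_pool(conv1d_valid(embedded, filt))
            features.append(value)
            hits.append((window, start))
    return features, hits


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example sequence
    features, hits = protein_features(seq)
    for value, (window, start) in zip(features, hits):
        print(window, start, seq[start:start + window], round(value, 3))
```

Each filter contributes one pooled value to the protein feature vector, and the recorded start position identifies which subsequence produced that value. In a trained model, positions like these are what one would compare against known binding sites.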