School of Information Engineering, Xijing University, Xi'an 710123, China.
Molecules. 2019 Aug 19;24(16):2999. doi: 10.3390/molecules24162999.
The identification of drug-target interactions (DTIs) is a critical step in drug development. Experimental methods that rely on clinical trials to discover DTIs are time-consuming, expensive, and challenging. Computational methods for predicting novel DTIs are therefore a valuable complement, since they can reduce cost and shorten the development period. In this paper, we present a novel computational model for predicting DTIs that uses protein sequence information and a rotation forest classifier. Specifically, each target protein sequence is first converted to a position-specific scoring matrix (PSSM) to retain evolutionary information. We then apply local phase quantization (LPQ) descriptors to the PSSM to extract that evolutionary information as a feature vector. For the drugs, substructure fingerprint information is used as the feature representation. Finally, we combine the drug and protein features to represent each drug-target pair and use a rotation forest classifier to score the likelihood of interaction, yielding a global DTI prediction. The experimental results indicate that the proposed model is effective, achieving average accuracies of 89.15%, 86.01%, 82.20%, and 71.67% on four datasets (enzyme, ion channel, G protein-coupled receptor (GPCR), and nuclear receptor), respectively. In addition, we compared the prediction performance of the rotation forest classifier with that of another popular classifier, the support vector machine, on the same dataset. Several previously proposed methods were also run on the same datasets for performance comparison. The comparison results indicate that the proposed method outperforms the others. We anticipate that the proposed method can serve as an effective tool for large-scale prediction of drug-target interactions, given protein sequences and drug fingerprints.
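The step that combines drug and protein features into one vector per drug-target pair can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_pair_features`, the identifiers, the toy vectors, and their lengths are all hypothetical, and the sketch assumes that combining means simple concatenation and that every pair not listed as a known interaction is labeled 0.

```python
from itertools import product


def build_pair_features(drug_fps, protein_descs, known_pairs):
    """Build one feature vector per drug-target pair by concatenation.

    drug_fps:      dict mapping drug id -> binary substructure fingerprint
    protein_descs: dict mapping protein id -> descriptor vector (e.g. LPQ of a PSSM)
    known_pairs:   set of (drug id, protein id) tuples with a known interaction
    """
    features, labels, pairs = [], [], []
    # Enumerate every drug-target combination in a deterministic order.
    for drug_id, prot_id in product(sorted(drug_fps), sorted(protein_descs)):
        # Assumption: the pair vector is the drug vector followed by the protein vector.
        features.append(list(drug_fps[drug_id]) + list(protein_descs[prot_id]))
        # Assumption: pairs absent from known_pairs are treated as negatives.
        labels.append(1 if (drug_id, prot_id) in known_pairs else 0)
        pairs.append((drug_id, prot_id))
    return features, labels, pairs


# Toy inputs (hypothetical values; real fingerprints and descriptors are much longer).
drug_fps = {"D1": [1, 0, 1, 1], "D2": [0, 1, 0, 0]}
protein_descs = {"P1": [0.2, 0.7, 0.1], "P2": [0.5, 0.1, 0.4]}
known_pairs = {("D1", "P2"), ("D2", "P1")}

X, y, pairs = build_pair_features(drug_fps, protein_descs, known_pairs)
for pair, label, vector in zip(pairs, y, X):
    print(pair, label, vector)
```

The resulting `X` and `y` have the shape a supervised classifier expects: one row per drug-target pair and one binary label per row. In the pipeline described above, that classifier would be a rotation forest, with a support vector machine as the comparison model.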