Department of Information Engineering, Xijing University, Xi'an 710123, China.
Molecules. 2018 Apr 4;23(4):823. doi: 10.3390/molecules23040823.
Protein-protein interactions (PPIs) play important roles in many aspects of the structural and functional organization of cells; thus, detecting PPIs is one of the most important problems in current molecular biology. Although much effort has been devoted to identifying PPIs with high-throughput techniques, these experimental methods are both time-consuming and costly, and they yield high rates of false positive and false negative results. Moreover, most of the proposed computational methods are limited by their reliance on information about protein homology or the interaction marks of the protein partners. In this paper, we report a computational method that uses only the information from protein sequences. The main improvements come from a novel protein sequence representation that combines the continuous and discrete wavelet transforms, and from adopting a weighted sparse representation-based classifier (WSRC). The proposed method was used to predict PPIs on three different datasets, including yeast and human. In addition, we used the prediction model trained on the yeast PPI dataset to predict the PPIs of six datasets from other species. To further evaluate the performance of the prediction model, we compared WSRC with the state-of-the-art support vector machine classifier. On the yeast, human and third datasets, we obtained high average prediction accuracies of 97.38%, 98.92% and 93.93%, respectively. In the cross-species experiments, most of the prediction accuracies are over 94%. These promising results show that the proposed method can achieve high performance in PPI detection.
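As a rough illustration of how a discrete wavelet transform can turn a variable-length protein sequence into a fixed-length feature vector, the following Python sketch may help. It is not the authors' implementation: the abstract does not say how residues are encoded numerically, which wavelet is used, or how coefficients are summarized, so the Kyte-Doolittle hydropathy scale, the Haar wavelet, the four decomposition levels, the summary statistics, and every function name here are assumptions. The continuous wavelet transform and the WSRC classifier are not covered.

```python
import math

# Kyte-Doolittle hydropathy values. Using this scale to encode residues is an
# assumption; the abstract does not state the numerical encoding.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(seq):
    """Map a protein sequence to a numeric signal, skipping unknown residues."""
    return [KD[a] for a in seq.upper() if a in KD]


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd lengths by repeating the last value
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def summary(values):
    """Mean, standard deviation, max and min of a coefficient list."""
    if not values:
        return [0.0, 0.0, 0.0, 0.0]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [mean, std, max(values), min(values)]


def dwt_features(seq, levels=4):
    """Fixed-length descriptor: 4 statistics per detail level plus 4 for the
    final approximation, i.e. 4 * (levels + 1) values for any sequence length."""
    signal = encode(seq)
    feats = []
    for _ in range(levels):
        if len(signal) < 2:
            feats.extend([0.0, 0.0, 0.0, 0.0])  # sequence too short for this level
            continue
        signal, detail = haar_step(signal)
        feats.extend(summary(detail))
    feats.extend(summary(signal))
    return feats


def pair_features(seq_a, seq_b, levels=4):
    """Describe a candidate protein pair by concatenating both descriptors."""
    return dwt_features(seq_a, levels) + dwt_features(seq_b, levels)
```

A call such as `pair_features("MKTAYIAKQR", "GAVLIPFMW")` returns a vector of the same length for any pair of sequences, which is the property a downstream classifier needs.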