Wang Yan-Bin, You Zhu-Hong, Li Xiao, Jiang Tong-Hai, Chen Xing, Zhou Xi, Wang Lei
Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China.
Mol Biosyst. 2017 Jun 27;13(7):1336-1344. doi: 10.1039/c7mb00188f.
Protein-protein interactions (PPIs) play an important role in most biological processes, so detecting them correctly and efficiently is a problem worth studying. Although high-throughput technologies make it possible to detect PPIs on a large scale, they cannot detect all PPIs and may generate unreliable data. To address this problem, this study proposes a novel computational method that predicts PPIs from protein sequence information. The method uses Zernike moments to extract protein sequence features from a position-specific scoring matrix (PSSM). These extracted features are then reconstructed using a stacked autoencoder. Finally, a novel probabilistic classification vector machine (PCVM) classifier is employed to predict the protein-protein interactions. On the Yeast and H. pylori PPI datasets, the proposed method achieved average accuracies of 96.60% and 91.19%, respectively. These promising results show that the proposed method has a better ability to detect PPIs than other detection methods. The method was also applied to predict PPIs in other species, and promising results were obtained. To further evaluate the method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The results obtained via multiple experiments show that our method is powerful, efficient, and feasible, and that it makes a great contribution to proteomics research.
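The feature-extraction step described in the abstract (Zernike moments computed from a PSSM) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `radial_polynomial` and `zernike_magnitudes`, the `max_order` parameter, the choice to inscribe the matrix in the unit disk, and the random matrix standing in for a real PSSM are all assumptions made for illustration. The abstract does not state which moment orders were used, and the stacked autoencoder and PCVM stages are not shown.

```python
import math
import numpy as np


def radial_polynomial(n, m, rho):
    """Zernike radial polynomial R_nm, evaluated elementwise on rho."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        out += coeff * rho ** (n - 2 * s)
    return out


def zernike_magnitudes(matrix, max_order=6):
    """Return |Z_nm| for all valid (n, m) with 0 <= m <= n <= max_order.

    The matrix (for example an L x 20 PSSM) is treated as a grayscale
    image. Its cells are mapped onto a square inscribed in the unit
    disk, so no cell is discarded (an assumption of this sketch).
    """
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = matrix.shape
    # Coordinates in [-1/sqrt(2), 1/sqrt(2)] keep every cell inside the disk.
    y = np.linspace(-1.0, 1.0, rows) / math.sqrt(2)
    x = np.linspace(-1.0, 1.0, cols) / math.sqrt(2)
    xx, yy = np.meshgrid(x, y)
    rho = np.hypot(xx, yy)
    theta = np.arctan2(yy, xx)

    features = []
    for n in range(max_order + 1):
        # Only repetitions m with n - m even are defined.
        for m in range(n % 2, n + 1, 2):
            # Complex conjugate of the basis function V_nm.
            basis = radial_polynomial(n, m, rho) * np.exp(-1j * m * theta)
            z = (n + 1) / np.pi * np.sum(matrix * basis)
            features.append(abs(z))
    return np.array(features)


# Placeholder input: a random 30 x 20 matrix standing in for a real PSSM.
rng = np.random.default_rng(0)
pssm = rng.normal(size=(30, 20))
features = zernike_magnitudes(pssm, max_order=6)
```

Taking magnitudes makes the descriptor insensitive to rotations of the matrix about its center, which is a common reason for choosing Zernike moments. In the pipeline the abstract describes, a vector like `features` would then be passed to the stacked autoencoder before classification.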