Wang Yanbin, You Zhuhong, Li Xiao, Chen Xing, Jiang Tonghai, Zhang Jingting
Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
Int J Mol Sci. 2017 May 11;18(5):1029. doi: 10.3390/ijms18051029.
Protein-protein interactions (PPIs) are essential to most biological processes in living organisms, so detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although high-throughput technologies have generated large amounts of PPI data for a variety of organisms, the whole interactome is still far from complete. In addition, high-throughput technologies for detecting PPIs have some unavoidable drawbacks: they are time-consuming, costly, and prone to high error rates. In recent years, with the development of machine learning, computational methods have been widely used to predict PPIs and can achieve good prediction performance. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, the ZM descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). A PCVM classifier is then used to infer interactions between proteins. When evaluated on two PPI datasets, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the proposed method, a state-of-the-art support vector machine (SVM) classifier was used for comparison with the PCVM model. The experimental results show that the PCVM classifier outperforms the SVM classifier. These results indicate that the proposed method is robust, powerful, and feasible, and that it can serve as a helpful tool for proteomics research.
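The feature-extraction step described above, in which a PSSM is treated as a two-dimensional array and summarized by Zernike moments, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the moment order, the normalization, or how the PSSM is preprocessed, so the maximum order of 6, the mapping of the matrix onto the unit disk, and the function names `radial_poly` and `zernike_features` are all assumptions made for illustration. The random matrix stands in for a real PSI-BLAST PSSM, and the PCVM classifier is not shown.

```python
import math
import numpy as np


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho), for n >= |m| and n - |m| even."""
    m = abs(m)
    out = np.zeros_like(rho, dtype=float)
    for s in range((n - m) // 2 + 1):
        coef = ((-1) ** s * math.factorial(n - s)
                / (math.factorial(s)
                   * math.factorial((n + m) // 2 - s)
                   * math.factorial((n - m) // 2 - s)))
        out += coef * rho ** (n - 2 * s)
    return out


def zernike_features(matrix, max_order=6):
    """Return |Z_nm| for 0 <= m <= n <= max_order with n - m even.

    max_order=6 is an illustrative choice, not a value from the paper.
    """
    f = np.asarray(matrix, dtype=float)
    rows, cols = f.shape
    # Map the grid onto [-1, 1] x [-1, 1] (an assumed normalization).
    y, x = np.meshgrid(np.linspace(-1, 1, rows),
                       np.linspace(-1, 1, cols), indexing="ij")
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0  # Zernike polynomials are defined on the unit disk
    feats = []
    for n in range(max_order + 1):
        for m in range(n % 2, n + 1, 2):
            # Complex conjugate of the basis function V_nm = R_nm * exp(j*m*theta)
            basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
            z = (n + 1) / np.pi * np.sum(f[inside] * basis_conj[inside])
            feats.append(abs(z))  # the magnitude is invariant to rotation
    return np.array(feats)


# Stand-in for a PSSM: one row per residue, 20 columns (one per amino acid).
rng = np.random.default_rng(0)
pssm_like = rng.normal(size=(120, 20))
features = zernike_features(pssm_like)
print(features.shape)
```

Using the moment magnitudes gives a fixed-length vector regardless of sequence length, which is what a classifier such as PCVM or SVM needs as input. How the vectors of the two proteins in a pair are combined is not specified in the abstract.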