University of Chinese Academy of Sciences, Beijing 100049, China.
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
Int J Biol Sci. 2018 May 23;14(8):983-991. doi: 10.7150/ijbs.23817. eCollection 2018.
Self-interacting proteins (SIPs) play a significant role in most important molecular processes in cells, such as signal transduction, gene expression regulation, immune response and enzyme activation. Although traditional experimental methods can be used to generate SIPs data, relying on biological techniques alone is expensive and time-consuming. Therefore, it is important and urgent to develop an efficient computational method for SIPs detection. In this study, we present a novel machine-learning-based SIPs identification method that combines the Zernike Moments (ZMs) descriptor on the Position Specific Scoring Matrix (PSSM) with Probabilistic Classification Vector Machines (PCVM) and a Stacked Sparse Auto-Encoder (SSAE). More specifically, an efficient feature extraction technique, ZMs, is first used to generate feature vectors from the PSSM; then, a deep neural network (the SSAE) is employed to reduce the feature dimensionality and noise; finally, the PCVM is used to perform the classification. The prediction performance of the proposed method is evaluated on two SIPs datasets via cross-validation. The experimental results indicate that the proposed method achieves accuracies of 92.55% and 97.47%, respectively. To further evaluate the advantage of our scheme for SIPs prediction, we also compared the PCVM classifier with the Support Vector Machine (SVM) and other existing techniques on the same datasets. The comparison results reveal that the proposed strategy outperforms the other methods and could be a useful tool for identifying SIPs.
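To make the feature-extraction step concrete, the following is a minimal sketch of computing Zernike moment magnitudes from a PSSM-like matrix with NumPy. It is not the authors' implementation: the function names `radial_poly` and `zernike_features`, the maximum order of 6, and the choice to map the matrix grid onto the square [-1, 1] x [-1, 1] and keep only cells inside the unit disk are all assumptions made for illustration. The moments are also left unscaled by the cell area, which only changes them by a constant factor.

```python
import math
import numpy as np


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm evaluated at the radii in rho."""
    m = abs(m)
    out = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        out += coeff * rho ** (n - 2 * s)
    return out


def zernike_features(matrix, max_order=6):
    """Return |Z_nm| for all n <= max_order, 0 <= m <= n, with n - m even.

    Assumption: the matrix (e.g. an L x 20 PSSM) is treated as an image
    whose rows and columns are mapped linearly onto [-1, 1].
    """
    f = np.asarray(matrix, dtype=float)
    rows, cols = f.shape
    y = np.linspace(-1.0, 1.0, rows)[:, None]
    x = np.linspace(-1.0, 1.0, cols)[None, :]
    rho = np.sqrt(x ** 2 + y ** 2)      # broadcasts to (rows, cols)
    theta = np.arctan2(y, x)            # broadcasts to (rows, cols)
    inside = rho <= 1.0                 # Zernike basis lives on the unit disk

    feats = []
    for n in range(max_order + 1):
        for m in range(n % 2, n + 1, 2):
            # Complex conjugate of the basis function V_nm = R_nm * exp(j m theta)
            basis = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
            z = (n + 1) / np.pi * np.sum(f[inside] * basis[inside])
            feats.append(abs(z))        # the magnitude is rotation invariant
    return np.array(feats)
```

In a pipeline like the one described, a vector of this kind would be computed per protein and then passed to the dimensionality-reduction and classification stages; those stages are not sketched here.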