AIEN Institute, Shanghai Ocean University, Shanghai 201306, China.
College of Sciences & Engineering, University of Tasmania, 7001 Tasmania, Australia.
Int J Mol Sci. 2019 May 11;20(9):2344. doi: 10.3390/ijms20092344.
Understanding how programmed cell death operates requires knowledge of the subcellular location of apoptosis proteins. Because experimental determination is costly and time-consuming, computational localization methods have become popular in recent decades; this research focuses mainly on new representations of protein sequences and on the choice of classification algorithm. In this study, we propose a novel tri-gram encoding model, based on the protein overlapping property matrix (POPM), for predicting the subcellular location of apoptosis proteins. From this encoding, a 1000-dimensional feature vector is built to represent each protein. Finally, support vector machine-recursive feature elimination (SVM-RFE) is used to select the optimal features, which are fed into a support vector machine (SVM) classifier for prediction. Jackknife tests on two benchmark datasets show that the proposed method achieves satisfactory prediction performance at a lower computational cost and could serve as a promising tool for predicting the subcellular locations of apoptosis proteins.
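The abstract does not spell out the POPM itself, so the following is only a minimal sketch of how a tri-gram encoding over overlapping properties could yield a 1000-dimensional vector (10 properties, 10^3 property triples). The ten property groups, the function name `trigram_features`, and the normalization by window count are all assumptions made for illustration, not the authors' definitions.

```python
from itertools import product

# ASSUMPTION: an illustrative ten-property grouping in which a residue may
# belong to several groups (hence "overlapping"). The matrix used in the
# paper may differ from this placeholder.
PROPERTY_GROUPS = [
    "ACFGHIKLMTVWY",  # hydrophobic
    "CDEHKNQRSTWY",   # polar
    "DEHKR",          # charged
    "HKR",            # positive
    "DE",             # negative
    "ACDGNPSTV",      # small
    "ACGS",           # tiny
    "ILV",            # aliphatic
    "FHWY",           # aromatic
    "P",              # proline
]
N_PROPS = len(PROPERTY_GROUPS)

# Map each residue letter to the indices of every property it has.
RESIDUE_PROPS = {}
for idx, members in enumerate(PROPERTY_GROUPS):
    for aa in members:
        RESIDUE_PROPS.setdefault(aa, []).append(idx)


def trigram_features(sequence):
    """Return a 1000-dimensional tri-gram property vector (hypothetical helper)."""
    vec = [0.0] * (N_PROPS ** 3)
    n_windows = 0
    for i in range(len(sequence) - 2):
        props = [RESIDUE_PROPS.get(aa) for aa in sequence[i:i + 3]]
        if any(p is None for p in props):
            continue  # skip windows containing unknown residues such as X
        n_windows += 1
        # Every combination of properties across the three residues
        # contributes one count to the matching property triple.
        for p, q, r in product(*props):
            vec[p * N_PROPS ** 2 + q * N_PROPS + r] += 1.0
    if n_windows:
        # ASSUMPTION: normalize by the number of valid windows.
        vec = [v / n_windows for v in vec]
    return vec


features = trigram_features("MKVLAAGIPDE")
print(len(features))
```

Vectors of this form could then be passed to a feature-selection step and an SVM classifier, as the abstract describes; those stages are not sketched here.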