Li Feng-Min, Li Qian-Zhong
Laboratory of Theoretical Biophysics, Department of Physics, College of Sciences and Technology, Inner Mongolia University, Hohhot 010021, China.
Protein Pept Lett. 2008;15(6):612-6. doi: 10.2174/092986608784966930.
The location of a protein in a cell is closely correlated with its biological function. Based on the concept that a protein's subcellular location is mainly determined by its amino acid composition and pseudo amino acid composition (PseAA), a new algorithm that combines the increment of diversity with a support vector machine is proposed to predict protein subcellular location. The method is applied to the subcellular locations of plant and non-plant proteins. In the jackknife test, the overall prediction accuracies are 88.3% for the eukaryotic plant proteins and 92.4% for the eukaryotic non-plant proteins. To estimate the effect of sequence identity on the predictive results, proteins with sequence identity ≤40% were selected. For this reduced set, the overall jackknife success rates are 86.2% for plant proteins and 92.3% for non-plant proteins.
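The abstract does not give formulas, so the following is only a minimal sketch of how an increment-of-diversity feature might be computed from amino acid composition. It assumes the commonly used definition of the diversity measure, D(X) = N log N − Σ nᵢ log nᵢ, with ID(X, Y) = D(X + Y) − D(X) − D(Y). The function names are hypothetical, and the PseAA terms and the support vector machine step are not shown.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order (an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_counts(sequence):
    """Count each standard amino acid in a sequence (hypothetical helper)."""
    counts = Counter(sequence.upper())
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


def diversity(counts):
    """Diversity measure D(X) = N log N - sum(n_i log n_i), skipping zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small values mean similar compositions."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)
```

In a setup like the one the abstract describes, one such value per subcellular class (query composition against the pooled composition of that class's training proteins) could serve as an input feature for the classifier. That wiring is an assumption here, not a detail confirmed by the abstract.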