Li F-M, Li Q-Z
Laboratory of Theoretical Biophysics, Department of Physics, College of Sciences and Technology, Inner Mongolia University, Hohhot, China.
Amino Acids. 2008 Jan;34(1):119-25. doi: 10.1007/s00726-007-0545-9. Epub 2007 May 21.
The subnuclear localization of a nuclear protein is very important for an in-depth understanding of the construction and function of the nucleus. Pseudo amino acid composition (PseAA), as originally introduced by K. C. Chou, can incorporate much more information about a protein sequence than the classical amino acid composition, and so significantly enhances the power of a discrete model to predict various attributes of a protein. Based on the amino acid and pseudo amino acid composition, an algorithm that combines increment of diversity with an improved quadratic discriminant analysis is proposed to predict protein subnuclear location. In the jackknife test, the overall predictive success rate and correlation coefficient are 75.4% and 0.629 for 504 single-localization proteins; for an independent set of 92 multi-localization proteins, the overall success rate is 80.4%. For 406 single-localization nuclear proteins with ≤25% sequence identity, the jackknife test gives an overall prediction accuracy of 77.1%.
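As an illustration of two of the quantities named in the abstract, the sketch below computes the classical 20-component amino acid composition and an increment of diversity between two count vectors. It assumes the commonly used definition D(X) = N ln N − Σ n_i ln n_i, with ID(X, Y) = D(X + Y) − D(X) − D(Y). The abstract does not give the formulas, so the exact features, the PseAA terms, and the improved quadratic discriminant step of the paper are not reproduced here. All function names are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_counts(sequence):
    """Count the 20 standard residues; any other symbol is ignored."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


def aa_composition(sequence):
    """Classical amino acid composition: residue frequencies summing to 1."""
    counts = aa_counts(sequence)
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def diversity(counts):
    """Assumed definition: D(X) = N ln N - sum(n_i ln n_i), with 0 ln 0 = 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x_counts, y_counts):
    """Assumed definition: ID(X, Y) = D(X + Y) - D(X) - D(Y).

    The value is 0 when the two count vectors are proportional and grows
    as their distributions diverge.
    """
    merged = [a + b for a, b in zip(x_counts, y_counts)]
    return diversity(merged) - diversity(x_counts) - diversity(y_counts)
```

In a classifier built along these lines, a query protein's counts would be compared against the pooled counts of each subnuclear class, and the resulting increments of diversity could serve as inputs to a discriminant function. That pipeline is an assumption about how the pieces fit together, not a description of the published method.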