Jin Lixia, Tang Huanwen, Fang Weiwu
Bioinformatics and Computational Biology, Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA 50010, USA.
J Bioinform Comput Biol. 2005 Aug;3(4):915-27. doi: 10.1142/s0219720005001399.
Given a raw protein sequence, knowing its subcellular location is an important step toward understanding its function and designing further experiments. A novel method is proposed for predicting protein subcellular locations from sequences. For four categories of eukaryotic proteins, the overall predictive accuracy is 82.0%, which is 2.6% higher than that obtained with a support vector machine (SVM) approach. For three subcellular locations of prokaryotic proteins, an overall accuracy of 89.9% is obtained. In accordance with the architecture of cells, a hierarchical prediction approach is designed. Based on amino acid composition, extracellular proteins and intracellular proteins can be identified with an accuracy of 97%.
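The abstract names amino acid composition as the feature used to separate extracellular from intracellular proteins, and describes a hierarchical prediction scheme. The sketch below illustrates those two ideas only; it is not the authors' method. The function names, the 20-letter residue alphabet, and the two classifier callables (`is_extracellular`, `intracellular_location`) are assumptions introduced for illustration, since the abstract does not specify the classifier or the lower levels of the hierarchy.

```python
from collections import Counter
from typing import Callable, Dict, List

# The 20 standard amino acids in one-letter code (assumed feature alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard residue in the sequence.

    Non-standard symbols (for example 'X' or gaps) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def predict_hierarchically(
    sequence: str,
    is_extracellular: Callable[[List[float]], bool],
    intracellular_location: Callable[[List[float]], str],
) -> str:
    """Two-level prediction: extracellular vs. intracellular first,
    then a finer label for intracellular proteins.

    Both classifiers are hypothetical placeholders supplied by the caller.
    """
    composition = amino_acid_composition(sequence)
    vector = [composition[aa] for aa in AMINO_ACIDS]
    if is_extracellular(vector):
        return "extracellular"
    return intracellular_location(vector)
```

Any trained model that accepts a 20-dimensional composition vector could be passed in as either callable; the top-level split mirrors the extracellular versus intracellular distinction mentioned in the abstract.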