Jin Yu-Huan, Niu Bing, Feng Kai-Yan, Lu Wen-Cong, Cai Yu-Dong, Li Guo-Zheng
Department of Chemistry, College of Sciences, Shanghai University, 99 Shang-Da Road, Shanghai, China 200444.
Protein Pept Lett. 2008;15(3):286-9. doi: 10.2174/092986608783744234.
Protein subcellular localization, which indicates where a protein resides in a cell, is an important characteristic of a protein and is closely related to its function. Predicting subcellular localization plays an important role in protein function prediction, genome annotation, and drug design. Predicting subcellular localization with bioinformatics approaches is therefore an important and challenging task. In this paper, a robust predictor, the AdaBoost learner, is introduced to predict protein subcellular localization from amino acid composition. Jackknife cross-validation and an independent dataset test were used to demonstrate that AdaBoost is a robust and efficient model for predicting protein subcellular localization. The correct prediction rates were 74.98% and 80.12% for the jackknife test and the independent dataset test, respectively, which are higher than those obtained with other existing predictors. An online server for predicting protein subcellular localization based on the AdaBoost classifier is available at http://chemdata.shu.edu.cn/sl12.
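The two ingredients named in the abstract, amino acid composition as the feature vector and jackknife (leave-one-out) validation, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `amino_acid_composition`, `jackknife_accuracy`, and the `train_and_predict` callback are hypothetical, and the AdaBoost learner itself is not reproduced here; any classifier could be supplied through the callback.

```python
from collections import Counter

# The 20 standard residues; this ordering is an assumption of the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20 residue frequencies of a sequence, in AMINO_ACIDS order."""
    # Non-standard symbols (e.g. X, B, Z) are ignored in this sketch.
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def jackknife_accuracy(features, labels, train_and_predict):
    """Leave-one-out accuracy: each sample is predicted from all the others.

    train_and_predict(train_x, train_y, query_x) is a hypothetical callback
    that trains a classifier on the training set and returns the predicted
    label for query_x.
    """
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if train_and_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

In this sketch each protein becomes a 20-dimensional vector whose entries sum to 1, and the jackknife rate is the fraction of proteins correctly predicted when each one is held out in turn.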