Wang Xiao, Li Guo-Zheng, Lu Wen-Cong
MOE Key Laboratory of Embedded System and Service Computing, Department of Control Science and Engineering, Tongji University, Shanghai, China.
Protein Pept Lett. 2013 Mar;20(3):309-17. doi: 10.2174/0929866511320030009.
Protein subcellular localization prediction uses computational methods to determine where a protein resides within a cell. Knowing the subcellular localization of viral proteins in a host cell or virus-infected cell is important because it is closely related to their destructive tendencies and consequences. Predicting viral protein subcellular localization is an important but challenging problem, particularly when a protein may simultaneously exist at, or move between, two or more different subcellular location sites. Most existing subcellular localization methods specialized for viral proteins handle only single-location proteins. To better reflect the characteristics of multiplex proteins, a new predictor, called Virus-ECC-mPLoc, has been developed for systems containing both singleplex and multiplex proteins. It introduces a multi-label learning approach that exploits correlations between subcellular locations, and it hybridizes gene ontology information with dipeptide composition information. It can be used to identify viral proteins among the following six locations: (1) viral capsid, (2) host cell membrane, (3) host endoplasmic reticulum, (4) host cytoplasm, (5) host nucleus, and (6) secreted. Experimental results show that the overall success rates obtained by Virus-ECC-mPLoc are 86.9% for the jackknife test and 87.2% for the independent data set test, which are significantly higher than those of any existing predictor. As a user-friendly web server, Virus-ECC-mPLoc is freely accessible to the public at http://levis.tongji.edu.cn:8080/bioinfo/Virus-ECC-mPLoc/.
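The dipeptide composition mentioned above is commonly computed as a 400-dimensional vector holding the relative frequency of each ordered pair of adjacent amino acids in a sequence. The following is a minimal Python sketch of that general definition, not the authors' implementation. The function name `dipeptide_composition`, the choice to skip pairs containing non-standard residues, and the normalization by the number of counted pairs are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered dipeptides, in a fixed order that defines the vector layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional dipeptide frequency vector (hypothetical helper).

    Pairs containing a non-standard residue (e.g. 'X') are skipped; this is
    an assumption of the sketch, not something stated in the abstract.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    # Slide a window of width 2 over the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

For example, `dipeptide_composition("ACAC")` yields a vector whose entries for `AC` and `CA` are 2/3 and 1/3 and whose remaining entries are zero. In a hybrid predictor of the kind described, such a vector would be combined with gene ontology features before classification; how the two are combined is not specified in the abstract.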