Jia Peilin, Qian Ziliang, Zeng Zhenbin, Cai Yudong, Li Yixue
Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China.
Biochem Biophys Res Commun. 2007 Jun 1;357(2):366-70. doi: 10.1016/j.bbrc.2007.03.139. Epub 2007 Apr 2.
Assigning subcellular localization (SL) to proteins is one of the major tasks of functional proteomics. Despite the impressive technical advances of the past decades, determining SL experimentally on a high-throughput scale is still time-consuming and laborious. Computational prediction is therefore the preferred method for large-scale assignment of protein SL, followed up by experimental studies where appropriate. In this report, we developed a prediction system for protein SL based on a machine learning approach, the Nearest Neighbor Algorithm (NNA), that incorporates a protein functional domain profile. The overall accuracy achieved by this system is 93.96%. Furthermore, we compared the system with other methods to demonstrate its validity and efficiency. We also provide an implementation of our Subcellular Location Prediction System (SLPS), which is available at http://pcal.biosino.org.
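The nearest-neighbor idea described above can be illustrated with a minimal sketch. It is not the SLPS implementation: the binary domain-profile encoding, the cosine-based distance, the function names, and the toy training data are all assumptions made for illustration. Each protein is represented as a vector whose entries mark the presence (1) or absence (0) of a functional domain, and a query protein is assigned the location of its closest training protein.

```python
from math import sqrt


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length profiles.

    Assumption: this metric is chosen for illustration only.
    A profile with no domains is treated as maximally distant.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def predict_location(query, training):
    """Assign the query the label of its single nearest training profile."""
    _, label = min(training, key=lambda item: cosine_distance(query, item[0]))
    return label


# Hypothetical toy data: four made-up domains, three made-up proteins.
training = [
    ([1, 1, 0, 0], "nucleus"),
    ([0, 0, 1, 1], "mitochondrion"),
    ([1, 0, 0, 1], "cytoplasm"),
]

query = [1, 1, 1, 0]  # shares two domains with the "nucleus" example
print(predict_location(query, training))
```

In this toy setting the query shares the most domains with the first training protein, so it receives that protein's label. A real system would build the profiles from a domain database and evaluate accuracy by cross-validation.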