PairProSVM: protein subcellular localization based on local pairwise profile alignment and SVM.

Authors

Mak Man-Wai, Guo Jian, Kung Sun-Yuan

Affiliation

Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2008 Jul-Sep;5(3):416-22. doi: 10.1109/TCBB.2007.70256.

DOI: 10.1109/TCBB.2007.70256
PMID: 18670044
Abstract

The subcellular locations of proteins are important functional annotations. An effective and reliable subcellular localization method is necessary for proteomics research. This paper introduces a new method, PairProSVM, to automatically predict the subcellular locations of proteins. The profiles of all protein sequences in the training set are constructed by PSI-BLAST, and the pairwise profile-alignment scores are used to form feature vectors for training a support vector machine (SVM) classifier. It was found that PairProSVM outperforms the methods that are based on sequence alignment and amino-acid compositions even if most of the homologous sequences have been removed. This paper also demonstrates that the performance of PairProSVM is sensitive (and somewhat proportional) to the degree to which its kernel matrix satisfies Mercer's condition. PairProSVM was evaluated on Reinhardt and Hubbard's, Huang and Li's, and Gardy et al.'s protein datasets. The overall accuracies on these three datasets reach 99.3%, 76.5%, and 91.9%, respectively, which are higher than or comparable to those obtained by sequence alignment and by the methods compared in this paper.
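
As a rough illustration of the feature construction described in the abstract, the following Python sketch builds score vectors from pairwise local alignments and derives a kernel matrix from them. It is a simplification under stated assumptions, not the authors' implementation: plain Smith-Waterman alignment of raw sequences with hypothetical scoring parameters (`match`, `mismatch`, `gap`) stands in for the alignment of PSI-BLAST profiles, and the toy sequences and all function names are made up for the example.

```python
from typing import List, Sequence


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Smith-Waterman local alignment score with a linear gap penalty.

    Assumption: this aligns raw sequences; the method in the abstract
    aligns PSI-BLAST profiles instead.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def score_vector(query: str, training: Sequence[str]) -> List[int]:
    """Feature vector: alignment scores of `query` against every training sequence."""
    return [local_alignment_score(query, t) for t in training]


def kernel_matrix(training: Sequence[str]) -> List[List[int]]:
    """Inner products between the score vectors of all training sequences."""
    vectors = [score_vector(t, training) for t in training]
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]


# Hypothetical toy sequences: the first two are near-identical, the third unrelated.
TRAINING = ["MKTAYIAKQR", "MKTAYLAKQR", "GSSGSSGSSG"]
K = kernel_matrix(TRAINING)
```

Because each kernel entry is an inner product of two score vectors, the resulting matrix is symmetric and positive semidefinite by construction, so it satisfies Mercer's condition. It could then be passed to any SVM implementation that accepts a precomputed kernel.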

Similar articles

1. PairProSVM: protein subcellular localization based on local pairwise profile alignment and SVM. IEEE/ACM Trans Comput Biol Bioinform. 2008 Jul-Sep;5(3):416-22. doi: 10.1109/TCBB.2007.70256.
2. TSSub: eukaryotic protein subcellular localization by extracting features from profiles. Bioinformatics. 2006 Jul 15;22(14):1784-5. doi: 10.1093/bioinformatics/btl180. Epub 2006 Jun 20.
3. Subcellular localization prediction with new protein encoding schemes. IEEE/ACM Trans Comput Biol Bioinform. 2007 Apr-Jun;4(2):227-32. doi: 10.1109/TCBB.2007.070209.
4. Protein subcellular localization based on PSI-BLAST and machine learning. J Bioinform Comput Biol. 2006 Dec;4(6):1181-95. doi: 10.1142/s0219720006002405.
5. SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection. Bioinformatics. 2008 Mar 15;24(6):783-90. doi: 10.1093/bioinformatics/btn028. Epub 2008 Feb 1.
6. Hum-PLoc: a novel ensemble classifier for predicting human protein subcellular localization. Biochem Biophys Res Commun. 2006 Aug 18;347(1):150-7. doi: 10.1016/j.bbrc.2006.06.059. Epub 2006 Jun 21.
7. Application of latent semantic analysis to protein remote homology detection. Bioinformatics. 2006 Feb 1;22(3):285-90. doi: 10.1093/bioinformatics/bti801. Epub 2005 Nov 29.
8. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition. BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
9. Implicit motif distribution based hybrid computational kernel for sequence classification. Bioinformatics. 2005 Apr 15;21(8):1429-36. doi: 10.1093/bioinformatics/bti212. Epub 2004 Dec 14.
10. Efficient remote homology detection using local structure. Bioinformatics. 2003 Nov 22;19(17):2294-301. doi: 10.1093/bioinformatics/btg317.

Cited by

1. A Comprehensive Review on RNA Subcellular Localization Prediction. ArXiv. 2025 Apr 24:arXiv:2504.17162v1.
2. A Review for Artificial Intelligence Based Protein Subcellular Localization. Biomolecules. 2024 Mar 27;14(4):409. doi: 10.3390/biom14040409.
3. Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features. Front Genet. 2021 Nov 5;12:783128. doi: 10.3389/fgene.2021.783128. eCollection 2021.
4. Self-evoluting framework of deep convolutional neural network for multilocus protein subcellular localization. Med Biol Eng Comput. 2020 Dec;58(12):3017-3038. doi: 10.1007/s11517-020-02275-w. Epub 2020 Oct 20.
5. Plant-mSubP: a computational framework for the prediction of single- and multi-target protein subcellular localization using integrated machine-learning approaches. AoB Plants. 2019 Oct 17;12(3):plz068. doi: 10.1093/aobpla/plz068. eCollection 2020 Jun.
6. The effect of three novel feature extraction methods on the prediction of the subcellular localization of multi-site virus proteins. Bioengineered. 2018 Jan 1;9(1):196-202. doi: 10.1080/21655979.2017.1373536. Epub 2017 Nov 22.
7. Predicting Subcellular Localization of Apoptosis Proteins Combining GO Features of Homologous Proteins and Distance Weighted KNN Classifier. Biomed Res Int. 2016;2016:1793272. doi: 10.1155/2016/1793272. Epub 2016 Apr 24.
8. Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins. BMC Bioinformatics. 2016 Feb 24;17:97. doi: 10.1186/s12859-016-0940-x.
9. Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble. BMC Bioinformatics. 2015;16 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-16-S12-S1. Epub 2015 Aug 25.
10. HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins. PLoS One. 2014 Mar 19;9(3):e89545. doi: 10.1371/journal.pone.0089545. eCollection 2014.