
Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs.

Authors

Park Keun-Joon, Kanehisa Minoru

Affiliation

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan.

Publication

Bioinformatics. 2003 Sep 1;19(13):1656-63. doi: 10.1093/bioinformatics/btg222.

DOI:10.1093/bioinformatics/btg222
PMID:12967962
Abstract

MOTIVATION

The subcellular location of a protein is closely correlated to its function. Thus, computational prediction of subcellular locations from the amino acid sequence information would help annotation and functional prediction of protein coding genes in complete genomes. We have developed a method based on support vector machines (SVMs).

RESULTS

We considered 12 subcellular locations in eukaryotic cells: chloroplast, cytoplasm, cytoskeleton, endoplasmic reticulum, extracellular medium, Golgi apparatus, lysosome, mitochondrion, nucleus, peroxisome, plasma membrane, and vacuole. We constructed a data set of proteins with known locations from the SWISS-PROT database. A set of SVMs was trained to predict the subcellular location of a given protein based on its amino acid, amino acid pair, and gapped amino acid pair compositions. The predictors based on these different compositions were then combined using a voting scheme. Results obtained through 5-fold cross-validation tests showed an improvement in prediction accuracy over the algorithm based on the amino acid composition only. This prediction method is available via the Internet.
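The three feature types and the voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gap sizes, and the tie-breaking rule in the vote are assumptions made for illustration, and the SVM training itself is omitted. Each feature vector would be passed to its own trained classifier, and the per-classifier labels would then be combined.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Fraction of each standard residue in the sequence (20 features)."""
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_composition(seq, gap=0):
    """Fraction of each ordered residue pair (400 features).

    gap=0 counts adjacent pairs; gap=k counts pairs separated by k residues.
    The choice of gap values is an assumption, not taken from the abstract.
    """
    step = gap + 1
    pairs = Counter(
        (seq[i], seq[i + step])
        for i in range(len(seq) - step)
        if seq[i] in AMINO_ACIDS and seq[i + step] in AMINO_ACIDS
    )
    total = sum(pairs.values()) or 1
    return [pairs[p] / total for p in product(AMINO_ACIDS, repeat=2)]


def majority_vote(predictions):
    """Return the most frequent label among the per-classifier predictions.

    Ties go to the earliest-listed predictor (a hypothetical rule).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example sequence
    features = {
        "aa": aa_composition(seq),
        "pair": pair_composition(seq, gap=0),
        "gapped_pair": pair_composition(seq, gap=1),
    }
    print({name: len(vec) for name, vec in features.items()})
    # Labels as they might come from three separately trained classifiers
    print(majority_vote(["nucleus", "cytoplasm", "nucleus"]))
```

Normalizing counts to fractions keeps the vectors comparable across proteins of different lengths, which matters because each classifier sees only a fixed-length composition vector rather than the sequence itself.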
