Predicting protein subcellular location by fusing multiple classifiers.

Authors

Chou Kuo-Chen, Shen Hong-Bin

Affiliation

Gordon Life Science Institute, 13784 Torrey Del Mar, San Diego, California 92130, USA.

Publication

J Cell Biochem. 2006 Oct 1;99(2):517-27. doi: 10.1002/jcb.20879.

DOI:10.1002/jcb.20879
PMID:16639720
Abstract

One of the fundamental goals in cell biology and proteomics is to identify the functions of proteins in the context of compartments that organize them in the cellular environment. Knowledge of subcellular locations of proteins can provide key hints for revealing their functions and understanding how they interact with each other in cellular networking. Unfortunately, it is both time-consuming and expensive to determine the localization of an uncharacterized protein in a living cell purely based on experiments. With the avalanche of newly found protein sequences emerging in the post genomic era, we are facing a critical challenge, that is, how to develop an automated method to fast and reliably identify their subcellular locations so as to be able to timely use them for basic research and drug discovery. In view of this, an ensemble classifier was developed by the approach of fusing many basic individual classifiers through a voting system. Each of these basic classifiers was trained in a different dimension of the amphiphilic pseudo amino acid composition (Chou [2005] Bioinformatics 21: 10-19). As a demonstration, predictions were performed with the fusion classifier for proteins among the following 14 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cytoplasm, (5) cytoskeleton, (6) endoplasmic reticulum, (7) extracellular, (8) Golgi apparatus, (9) lysosome, (10) mitochondria, (11) nucleus, (12) peroxisome, (13) plasma membrane, and (14) vacuole. The overall success rates thus obtained via the resubstitution test, jackknife test, and independent dataset test were all significantly higher than those by the existing classifiers. It is anticipated that the novel ensemble classifier may also become a very useful vehicle in classifying other attributes of proteins according to their sequences, such as membrane protein type, enzyme family/sub-family, G-protein coupled receptor (GPCR) type, and structural class, among many others. 
The fusion ensemble classifier will be available at www.pami.sjtu.edu.cn/people/hbshen.
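
The voting fusion and the jackknife test mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the base classifier or the voting weights, so a plain 1-nearest-neighbor rule stands in for each base classifier, "a different dimension" is approximated by using only the first `dim` components of a feature vector, and all names (`nearest_label`, `fuse_by_vote`, `ensemble_predict`, `jackknife_accuracy`) are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# A sample is (feature vector, subcellular location label).
Sample = Tuple[Sequence[float], str]


def nearest_label(train: List[Sample], x: Sequence[float], dim: int) -> str:
    """Assumed base classifier: 1-nearest neighbor on the first `dim` features."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a[:dim], b[:dim]))
    return min(train, key=lambda s: sq_dist(s[0], x))[1]


def fuse_by_vote(votes: Sequence[str],
                 weights: Optional[Sequence[float]] = None) -> str:
    """Return the label with the largest (optionally weighted) vote total.

    Ties go to the label that was voted for first.
    """
    if weights is None:
        weights = [1.0] * len(votes)
    if len(weights) != len(votes):
        raise ValueError("votes and weights must have the same length")
    totals: Dict[str, float] = {}
    for label, w in zip(votes, weights):
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=lambda label: totals[label])


def ensemble_predict(train: List[Sample], x: Sequence[float],
                     dims: Sequence[int]) -> str:
    """One base classifier per feature dimension, fused by voting."""
    votes = [nearest_label(train, x, d) for d in dims]
    return fuse_by_vote(votes)


def jackknife_accuracy(data: List[Sample], dims: Sequence[int]) -> float:
    """Jackknife (leave-one-out) test: predict each sample from all the others."""
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if ensemble_predict(train, x, dims) == label:
            correct += 1
    return correct / len(data)


if __name__ == "__main__":
    # Toy, made-up feature vectors; real inputs would be sequence-derived.
    toy = [
        ([0.0, 0.0, 0.0], "nucleus"),
        ([0.1, 0.1, 0.0], "nucleus"),
        ([1.0, 1.0, 1.0], "cytoplasm"),
        ([0.9, 1.0, 1.1], "cytoplasm"),
    ]
    print(jackknife_accuracy(toy, dims=[1, 2, 3]))
```

In the jackknife loop each protein is held out in turn and predicted from the remaining ones, so no sample is ever tested against a model that has seen it. The resubstitution test, by contrast, predicts samples that were also used for training.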

Similar articles

1
Predicting protein subcellular location by fusing multiple classifiers.
J Cell Biochem. 2006 Oct 1;99(2):517-27. doi: 10.1002/jcb.20879.
2
Predicting eukaryotic protein subcellular location by fusing optimized evidence-theoretic K-Nearest Neighbor classifiers.
J Proteome Res. 2006 Aug;5(8):1888-97. doi: 10.1021/pr060167c.
3
Euk-PLoc: an ensemble classifier for large-scale eukaryotic protein subcellular location prediction.
Amino Acids. 2007 Jul;33(1):57-67. doi: 10.1007/s00726-006-0478-8. Epub 2007 Jan 19.
4
Hum-PLoc: a novel ensemble classifier for predicting human protein subcellular localization.
Biochem Biophys Res Commun. 2006 Aug 18;347(1):150-7. doi: 10.1016/j.bbrc.2006.06.059. Epub 2006 Jun 21.
5
Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition.
Biochem Biophys Res Commun. 2005 Nov 25;337(3):752-6. doi: 10.1016/j.bbrc.2005.09.117. Epub 2005 Sep 28.
6
Prediction and classification of protein subcellular location-sequence-order effect and pseudo amino acid composition.
J Cell Biochem. 2003 Dec 15;90(6):1250-60. doi: 10.1002/jcb.10719.
7
Using ensemble classifier to identify membrane protein types.
Amino Acids. 2007;32(4):483-8. doi: 10.1007/s00726-006-0439-2. Epub 2006 Oct 12.
8
Using optimized evidence-theoretic K-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types.
Biochem Biophys Res Commun. 2005 Aug 19;334(1):288-92. doi: 10.1016/j.bbrc.2005.06.087.
9
Gpos-PLoc: an ensemble classifier for predicting subcellular localization of Gram-positive bacterial proteins.
Protein Eng Des Sel. 2007 Jan;20(1):39-46. doi: 10.1093/protein/gzl053. Epub 2007 Jan 23.
10
Large-scale plant protein subcellular location prediction.
J Cell Biochem. 2007 Feb 15;100(3):665-78. doi: 10.1002/jcb.21096.

Cited by

1
Imbalanced classification for protein subcellular localization with multilabel oversampling.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac841.
2
iATC-mHyb: a hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals.
Oncotarget. 2017 Apr 11;8(35):58494-58503. doi: 10.18632/oncotarget.17028. eCollection 2017 Aug 29.
3
ProFold: Protein Fold Classification with Additional Structural Features and a Novel Ensemble Classifier.
Biomed Res Int. 2016;2016:6802832. doi: 10.1155/2016/6802832. Epub 2016 Aug 28.
4
iROS-gPseKNC: Predicting replication origin sites in DNA by incorporating dinucleotide position-specific propensity into general pseudo nucleotide composition.
Oncotarget. 2016 Jun 7;7(23):34180-9. doi: 10.18632/oncotarget.9057.
5
acACS: improving the prediction accuracy of protein subcellular locations and protein classification by incorporating the average chemical shifts composition.
ScientificWorldJournal. 2014;2014:864135. doi: 10.1155/2014/864135. Epub 2014 Jul 2.
6
A multilabel model based on Chou's pseudo-amino acid composition for identifying membrane proteins with both single and multiple functional types.
J Membr Biol. 2013 Apr;246(4):327-34. doi: 10.1007/s00232-013-9536-9. Epub 2013 Apr 2.
7
Some remarks on protein attribute prediction and pseudo amino acid composition.
J Theor Biol. 2011 Mar 21;273(1):236-47. doi: 10.1016/j.jtbi.2010.12.024. Epub 2010 Dec 17.
8
Multi label learning for prediction of human protein subcellular localizations.
Protein J. 2009 Dec;28(9-10):384-90. doi: 10.1007/s10930-009-9205-0.
9
Automated classification of fMRI data employing trial-based imagery tasks.
Med Image Anal. 2009 Jun;13(3):392-404. doi: 10.1016/j.media.2009.01.001. Epub 2009 Jan 16.
10
'Unite and conquer': enhanced prediction of protein subcellular localization by integrating multiple specialized tools.
BMC Bioinformatics. 2007 Oct 29;8:420. doi: 10.1186/1471-2105-8-420.