
Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble.

Authors

Wang Xiao, Zhang Jun, Li Guo-Zheng

Publication information

BMC Bioinformatics. 2015;16 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-16-S12-S1. Epub 2015 Aug 25.

DOI: 10.1186/1471-2105-16-S12-S1
PMID: 26329681
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4705491/
Abstract

BACKGROUND

Predicting the subcellular locations of bacterial proteins with computational methods has become an important and challenging task. Although many prediction methods exist for bacterial proteins, most of them can handle only single-location proteins. However, many bacterial proteins reside in more than one subcellular location. Moreover, multi-location proteins have special biological functions that can aid the development of new drugs. New computational methods are therefore needed to accurately predict the subcellular locations of multi-location bacterial proteins.

RESULTS

In this article, two efficient multi-label predictors, Gpos-ECC-mPLoc and Gneg-ECC-mPLoc, are developed to predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins, respectively. Both predictors construct GO vectors from the GO terms of proteins homologous to the query protein and then apply a multi-label ensemble classifier to make the final multi-label prediction. The two predictors have the following advantages: (1) they improve prediction performance on multi-label proteins by taking the correlations among different labels into account; (2) they combine multiple CC classifiers and obtain better predictions through ensemble learning; and (3) they construct the GO vectors from the frequency of occurrence of GO terms in the typical homologous set instead of using 0/1 values. Experimental results show that Gpos-ECC-mPLoc and Gneg-ECC-mPLoc can efficiently predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins, respectively.
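The two ideas in the paragraph above (frequency-based GO vectors and an ensemble of chained classifiers) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes that "CC" denotes classifier chains, which the abstract does not spell out, and it swaps in a toy nearest-centroid base learner because the abstract does not name one. The GO identifiers (`GO:mem`, `GO:cyt`, `GO:ext`), the training proteins, and all class and function names are hypothetical placeholders.

```python
import random
from collections import Counter


def go_frequency_vector(homolog_go_terms, vocabulary):
    """Count how often each GO term occurs across a protein's homologs
    (frequency values rather than 0/1 presence flags)."""
    counts = Counter(term for terms in homolog_go_terms for term in terms)
    return [float(counts[term]) for term in vocabulary]


class NearestCentroid:
    """Toy binary base learner (an assumption, not the paper's choice)."""

    def fit(self, X, y):
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, label in zip(X, y) if label == cls]
            if rows:  # a class with no examples gets no centroid
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


class ClassifierChain:
    """One binary model per label; each model also sees the labels that
    precede it in the chain, which lets it exploit label correlations."""

    def __init__(self, order):
        self.order = order

    def fit(self, X, Y):
        self.models = []
        for pos, label in enumerate(self.order):
            earlier = self.order[:pos]
            # Training uses the true values of the earlier labels.
            X_aug = [x + [float(y[e]) for e in earlier] for x, y in zip(X, Y)]
            self.models.append(NearestCentroid().fit(X_aug, [y[label] for y in Y]))
        return self

    def predict_one(self, x):
        pred = {}
        for pos, label in enumerate(self.order):
            # Prediction feeds the chain its own earlier predictions.
            x_aug = x + [float(pred[e]) for e in self.order[:pos]]
            pred[label] = self.models[pos].predict_one(x_aug)
        return [pred[label] for label in range(len(self.order))]


class EnsembleOfChains:
    """Train several chains with random label orders and average their votes."""

    def __init__(self, n_chains=5, threshold=0.5, seed=0):
        self.n_chains, self.threshold, self.seed = n_chains, threshold, seed

    def fit(self, X, Y):
        rng = random.Random(self.seed)
        self.chains = []
        for _ in range(self.n_chains):
            order = list(range(len(Y[0])))
            rng.shuffle(order)
            self.chains.append(ClassifierChain(order).fit(X, Y))
        return self

    def predict_one(self, x):
        votes = [chain.predict_one(x) for chain in self.chains]
        return [int(sum(col) / len(votes) >= self.threshold) for col in zip(*votes)]


# Hypothetical GO vocabulary and label set: [membrane, cytoplasm, extracellular].
vocabulary = ["GO:mem", "GO:cyt", "GO:ext"]
homolog_sets = [
    [["GO:mem"], ["GO:mem"]],
    [["GO:cyt"], ["GO:cyt"], ["GO:cyt"]],
    [["GO:mem", "GO:cyt"], ["GO:mem"], ["GO:cyt"]],
    [["GO:ext"], ["GO:ext"]],
]
X = [go_frequency_vector(h, vocabulary) for h in homolog_sets]
Y = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

model = EnsembleOfChains(n_chains=5, seed=0).fit(X, Y)
query = go_frequency_vector([["GO:mem", "GO:cyt"], ["GO:mem"]], vocabulary)
prediction = model.predict_one(query)
print(prediction)  # one 0/1 flag per location; several flags may be 1
```

Because each chain visits the labels in a different random order, averaging the chains' votes reduces the dependence on any single ordering, which is the usual motivation for ensembling chained classifiers.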

CONCLUSIONS

Gpos-ECC-mPLoc and Gneg-ECC-mPLoc improve the accuracy of subcellular localization prediction for multi-location gram-positive and gram-negative bacterial proteins, respectively. Web servers for the two predictors are freely accessible at http://biomed.zzuli.edu.cn/bioinfo/gpos-ecc-mploc/ and http://biomed.zzuli.edu.cn/bioinfo/gneg-ecc-mploc/, respectively.


Similar articles

1
Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble.
BMC Bioinformatics. 2015;16 Suppl 12(Suppl 12):S1. doi: 10.1186/1471-2105-16-S12-S1. Epub 2015 Aug 25.
2
A multi-label classifier for predicting the subcellular localization of gram-negative bacterial proteins with both single and multiple sites.
PLoS One. 2011;6(6):e20592. doi: 10.1371/journal.pone.0020592. Epub 2011 Jun 17.
3
Gneg-mPLoc: a top-down strategy to enhance the quality of predicting subcellular localization of Gram-negative bacterial proteins.
J Theor Biol. 2010 May 21;264(2):326-33. doi: 10.1016/j.jtbi.2010.01.018. Epub 2010 Jan 20.
4
Gpos-mPLoc: a top-down approach to improve the quality of predicting subcellular localization of Gram-positive bacterial proteins.
Protein Pept Lett. 2009;16(12):1478-84. doi: 10.2174/092986609789839322.
5
iLoc-Gpos: a multi-layer classifier for predicting the subcellular localization of singleplex and multiplex Gram-positive bacterial proteins.
Protein Pept Lett. 2012 Jan;19(1):4-14. doi: 10.2174/092986612798472839.
6
Virus-ECC-mPLoc: a multi-label predictor for predicting the subcellular localization of virus proteins with both single and multiple sites based on a general form of Chou's pseudo amino acid composition.
Protein Pept Lett. 2013 Mar;20(3):309-17. doi: 10.2174/0929866511320030009.
7
A multi-label predictor for identifying the subcellular locations of singleplex and multiplex eukaryotic proteins.
PLoS One. 2012;7(5):e36317. doi: 10.1371/journal.pone.0036317. Epub 2012 May 22.
8
Gpos-PLoc: an ensemble classifier for predicting subcellular localization of Gram-positive bacterial proteins.
Protein Eng Des Sel. 2007 Jan;20(1):39-46. doi: 10.1093/protein/gzl053. Epub 2007 Jan 23.
9
Use of Chou's 5-steps rule to predict the subcellular localization of gram-negative and gram-positive bacterial proteins by multi-label learning based on gene ontology annotation and profile alignment.
J Integr Bioinform. 2020 Jun 29;18(1):51-79. doi: 10.1515/jib-2019-0091.
10
MSLoc-DT: a new method for predicting the protein subcellular location of multispecies based on decision templates.
Anal Biochem. 2014 Mar 15;449:164-71. doi: 10.1016/j.ab.2013.12.013. Epub 2013 Dec 21.

Cited by

1
Practical Applications of Language Models in Protein Sorting Prediction: SignalP 6.0, DeepLoc 2.1, and DeepLocPro 1.0.
Methods Mol Biol. 2025;2941:153-175. doi: 10.1007/978-1-0716-4623-6_10.
2
A Review for Artificial Intelligence Based Protein Subcellular Localization.
Biomolecules. 2024 Mar 27;14(4):409. doi: 10.3390/biom14040409.
3
Three Distinct Proteases Are Responsible for Overall Cell Surface Proteolysis in Streptococcus thermophilus.
Appl Environ Microbiol. 2021 Nov 10;87(23):e0129221. doi: 10.1128/AEM.01292-21. Epub 2021 Sep 22.
4
Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences.
Front Genet. 2020 Nov 25;11:607812. doi: 10.3389/fgene.2020.607812. eCollection 2020.
5
Use of Chou's 5-steps rule to predict the subcellular localization of gram-negative and gram-positive bacterial proteins by multi-label learning based on gene ontology annotation and profile alignment.
J Integr Bioinform. 2020 Jun 29;18(1):51-79. doi: 10.1515/jib-2019-0091.
6
Subcellular location prediction of apoptosis proteins using two novel feature extraction methods based on evolutionary information and LDA.
BMC Bioinformatics. 2020 May 24;21(1):212. doi: 10.1186/s12859-020-3539-1.
7
PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data.
Bioinformatics. 2020 May 1;36(10):3043-3048. doi: 10.1093/bioinformatics/btaa136.
