
HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins.

Affiliations

Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China.

Department of Electrical Engineering, Princeton University, Princeton, New Jersey, United States of America.

Publication information

PLoS One. 2014 Mar 19;9(3):e89545. doi: 10.1371/journal.pone.0089545. eCollection 2014.

DOI:10.1371/journal.pone.0089545
PMID:24647341
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3960097/
Abstract

Protein subcellular localization prediction, as an essential step to elucidate the in vivo functions of proteins and identify drug targets, has been extensively studied in previous decades. Instead of only determining the subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms is retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/.
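
The feature construction and decision step described in the abstract can be illustrated with a small sketch. The abstract does not state which semantic similarity measure is used, how the two vectors are hybridized, or the exact adaptive decision rule, so everything below is an assumption made for illustration: the toy GO hierarchy, the Jaccard-style `similarity`, fusion by concatenation, the thresholding rule in `adaptive_decision`, and the parameter `theta` are all hypothetical and are not taken from HybridGO-Loc. The SVM itself is omitted; the decision function only shows how per-location scores might be turned into a multi-label prediction.

```python
from collections import Counter

# Toy GO hierarchy (hypothetical): each term maps to the set of its ancestors.
ANCESTORS = {
    "GO:A": {"GO:root"},
    "GO:B": {"GO:root", "GO:A"},
    "GO:C": {"GO:root", "GO:A"},
    "GO:D": {"GO:root"},
}
VOCAB = sorted(ANCESTORS)  # fixed ordering of the distinct GO terms


def frequency_vector(terms):
    """Count how often each vocabulary term occurs among the retrieved GO terms."""
    counts = Counter(terms)
    return [float(counts[t]) for t in VOCAB]


def similarity(t1, t2):
    """Stand-in similarity (assumption): Jaccard overlap of ancestor sets, self included."""
    a = ANCESTORS[t1] | {t1}
    b = ANCESTORS[t2] | {t2}
    return len(a & b) / len(a | b)


def similarity_vector(terms):
    """For each vocabulary term, the best similarity to any retrieved GO term."""
    unique = set(terms)
    if not unique:
        return [0.0] * len(VOCAB)
    return [max(similarity(v, t) for t in unique) for v in VOCAB]


def fusion_vector(terms):
    """Hybridize by concatenating the two vectors (assumption)."""
    return frequency_vector(terms) + similarity_vector(terms)


def adaptive_decision(scores, theta=0.5):
    """Turn per-location scores into a label set (assumed rule).

    If no score is positive, return only the top-scoring location.
    Otherwise return every location scoring at least min(1, theta * max score).
    """
    best = max(scores.values())
    if best <= 0:
        return [max(scores, key=scores.get)]
    threshold = min(1.0, theta * best)
    return sorted(label for label, s in scores.items() if s >= threshold)


if __name__ == "__main__":
    terms = ["GO:B", "GO:B", "GO:D"]  # GO terms retrieved for one protein (toy data)
    print(fusion_vector(terms))
    print(adaptive_decision({"nucleus": 1.8, "cytoplasm": 0.7, "membrane": -0.4}, theta=0.3))
```

In this sketch a smaller `theta` lowers the threshold relative to the top score, so more locations are admitted for a protein; a larger `theta` pushes the prediction toward a single location.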


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c55/3960097/d972c2697676/pone.0089545.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c55/3960097/34e6993a63f4/pone.0089545.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c55/3960097/3c39ee63fc2a/pone.0089545.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c55/3960097/2d7c07d08bb4/pone.0089545.g004.jpg

