Assessing protein similarity with Gene Ontology and its use in subnuclear localization prediction.

Author information

Lei Zhengdeng, Dai Yang

Affiliation

Department of Bioengineering (MC063), University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607, USA.

Publication information

BMC Bioinformatics. 2006 Nov 7;7:491. doi: 10.1186/1471-2105-7-491.

DOI:10.1186/1471-2105-7-491
PMID:17090318
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1660555/
Abstract

BACKGROUND

The completion of various genome sequencing projects has led to the accumulation of a massive amount of gene sequence information. This calls for large-scale computational methods for predicting protein localization from sequence. A protein's localization can provide valuable information about its molecular function, as well as the biological pathway in which it participates. Predicting the localization of a protein at the subnuclear level is a challenging task. In our previous work we proposed an SVM-based system that uses protein sequence information for this prediction task. In this work, we assess protein similarity with Gene Ontology (GO) and then improve the performance of the system by adding a nearest-neighbor classifier module that uses a similarity measure derived from the GO annotation terms of protein sequences.

RESULTS

The performance of the new system proposed here was compared with that of our previous system on a set of proteins residing in 6 localizations, collected from the Nuclear Protein Database (NPD). The overall MCC (accuracy) rises from 0.284 (50.0%) to 0.519 (66.5%) for single-localization proteins in leave-one-out cross-validation, and from 0.420 (65.2%) to 0.541 (65.2%) for an independent set of multi-localization proteins. The new system is available at http://array.bioengr.uic.edu/subnuclear.htm.
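
For readers unfamiliar with the metric, the Matthews correlation coefficient (MCC) reported above can be computed from confusion-matrix counts. The sketch below covers only the standard binary (one-vs-rest) form. The abstract does not say how per-class values are combined into the "overall" MCC, so that step is left out, and the function names are illustrative rather than taken from the authors' system.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion counts.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here; the abstract does not specify one).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)
```

Unlike accuracy, the MCC uses all four confusion counts, which is one reason the two figures can move differently, as in the multi-localization result above.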

CONCLUSION

The prediction of protein subnuclear localizations can be largely influenced by various definitions of similarity for a pair of proteins based on different similarity measures of GO terms. Using the sum of similarity scores over the matched GO term pairs for two proteins as the similarity definition produced the best predictive outcome. Substantial improvement in predicting protein subnuclear localizations has been achieved by combining Gene Ontology with sequence information.
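
The similarity definition named in the conclusion, a sum of term-level scores over matched GO term pairs, feeding a nearest-neighbor classifier, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define how term pairs are matched, so each term of the first protein is paired here with its best-scoring term in the second (which makes the score asymmetric), and the term-level measure `toy_term_sim`, the term IDs `T1` to `T4`, and the labels `loc_A` and `loc_B` are all hypothetical placeholders.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

TermSim = Callable[[str, str], float]

# Hypothetical term-pair scores; a real system would derive these from GO.
_TOY_SCORES = {frozenset({"T1", "T4"}): 0.5}


def toy_term_sim(a: str, b: str) -> float:
    """Placeholder term-level similarity: 1.0 for identical terms,
    a table lookup for known pairs, 0.0 otherwise."""
    if a == b:
        return 1.0
    return _TOY_SCORES.get(frozenset({a, b}), 0.0)


def protein_similarity(terms_a: FrozenSet[str], terms_b: FrozenSet[str],
                       term_sim: TermSim) -> float:
    """Sum of scores over matched term pairs.

    Assumption: each term in terms_a is matched to its most similar
    term in terms_b (best-match pairing).
    """
    if not terms_a or not terms_b:
        return 0.0
    return sum(max(term_sim(a, b) for b in terms_b) for a in terms_a)


def nearest_neighbor_label(query: FrozenSet[str],
                           training: List[Tuple[FrozenSet[str], str]],
                           term_sim: TermSim) -> Optional[str]:
    """Return the label of the training protein most similar to the query."""
    best_label, best_score = None, float("-inf")
    for terms, label in training:
        score = protein_similarity(query, terms, term_sim)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training_set = [
    (frozenset({"T1", "T2"}), "loc_A"),
    (frozenset({"T3"}), "loc_B"),
]
predicted = nearest_neighbor_label(frozenset({"T1", "T4"}), training_set,
                                   toy_term_sim)
```

Swapping in a different `term_sim` or a different pairing rule changes the protein-level score, which is the kind of sensitivity the conclusion describes.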

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a45/1660555/63ea89703b1c/1471-2105-7-491-1.jpg
