HashGO: hashing gene ontology for protein function prediction.

Authors

Yu Guoxian, Zhao Yingwen, Lu Chang, Wang Jun

Affiliations

College of Computer and Information Science, Southwest University, Chongqing 400715, China.

Publication information

Comput Biol Chem. 2017 Dec;71:264-273. doi: 10.1016/j.compbiolchem.2017.09.010. Epub 2017 Oct 4.

DOI: 10.1016/j.compbiolchem.2017.09.010
PMID: 29031869
Abstract

Gene Ontology (GO) is a standardized, controlled vocabulary of terms that describe the molecular functions, biological roles, and cellular locations of proteins. GO terms and the GO hierarchy are regularly updated as biological knowledge accumulates. GO includes more than 50,000 terms, and each protein is annotated with several to dozens of them, so accurately predicting associations between proteins and this many GO terms is challenging. To address this, we propose Hashing GO for protein function prediction (HashGO for short). HashGO first stores the available GO annotations of proteins in a protein-term association matrix. It then tailors a graph hashing method to explore the underlying structure between GO terms and to obtain a series of hash functions that compress the high-dimensional protein-term association matrix into a low-dimensional one. Next, HashGO computes the semantic similarity between proteins from the Hamming distance on that low-dimensional matrix. Finally, it predicts the missing annotations of a protein from the annotations of its semantic neighbors. Experimental results on archived GO annotations of two model species (yeast and human) show that HashGO predicts functions more accurately than related approaches and also runs faster.
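The four steps named in the abstract (association matrix, hashing, Hamming-distance similarity, neighbor-based prediction) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the graph hashing method, so the sketch swaps in sign-of-random-projection hashing as a stand-in. The function names (`hash_annotations`, `hamming_distances`, `predict_scores`), the parameters `n_bits` and `k`, and the similarity-weighted vote are all assumptions made for illustration.

```python
import numpy as np


def hash_annotations(A, n_bits, seed=0):
    """Compress a binary protein-term matrix A (proteins x terms) into
    n_bits-bit binary codes.

    ASSUMPTION: uses the sign of random projections as a stand-in;
    the paper's tailored graph hashing is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((A.shape[1], n_bits))
    centered = A - A.mean(axis=0)  # center each term column
    return (centered @ W > 0).astype(np.uint8)


def hamming_distances(codes):
    """Pairwise Hamming distances between the rows of a binary code matrix."""
    return (codes[:, None, :] != codes[None, :, :]).sum(axis=2)


def predict_scores(A, codes, k):
    """Score every term for every protein from its k nearest neighbors
    in Hamming space (hypothetical similarity-weighted vote)."""
    n, n_bits = codes.shape
    D = hamming_distances(codes).astype(float)
    np.fill_diagonal(D, np.inf)  # a protein is not its own neighbor
    scores = np.zeros(A.shape, dtype=float)
    for i in range(n):
        nbrs = np.argsort(D[i])[:k]
        sim = 1.0 - D[i, nbrs] / n_bits  # similarity in [0, 1]
        total = sim.sum()
        if total > 0:
            scores[i] = sim @ A[nbrs] / total
    return scores


# Toy data: 5 proteins x 6 GO terms; 1 means "protein annotated with term".
A = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
])
codes = hash_annotations(A, n_bits=4)
scores = predict_scores(A, codes, k=2)
```

A high score for a term the protein does not yet carry would be read as a candidate missing annotation. With real data, `A` would have tens of thousands of columns, which is where compressing to short binary codes is expected to pay off.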

Similar articles

1
HashGO: hashing gene ontology for protein function prediction.
Comput Biol Chem. 2017 Dec;71:264-273. doi: 10.1016/j.compbiolchem.2017.09.010. Epub 2017 Oct 4.
2
Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.
Genomics. 2019 May;111(3):334-342. doi: 10.1016/j.ygeno.2018.02.008. Epub 2018 Feb 23.
3
Predicting protein function via downward random walks on a gene ontology.
BMC Bioinformatics. 2015 Aug 27;16:271. doi: 10.1186/s12859-015-0713-y.
4
Interspecies gene function prediction using semantic similarity.
BMC Syst Biol. 2016 Dec 23;10(Suppl 4):121. doi: 10.1186/s12918-016-0361-5.
5
NMFGO: Gene Function Prediction via Nonnegative Matrix Factorization with Gene Ontology.
IEEE/ACM Trans Comput Biol Bioinform. 2020 Jan-Feb;17(1):238-249. doi: 10.1109/TCBB.2018.2861379. Epub 2018 Jul 30.
6
NoisyGOA: Noisy GO annotations prediction using taxonomic and semantic similarity.
Comput Biol Chem. 2016 Dec;65:203-211. doi: 10.1016/j.compbiolchem.2016.09.005. Epub 2016 Sep 13.
7
Predicting functions of maize proteins using graph convolutional network.
BMC Bioinformatics. 2020 Dec 16;21(Suppl 16):420. doi: 10.1186/s12859-020-03745-6.
8
A relation based measure of semantic similarity for Gene Ontology annotations.
BMC Bioinformatics. 2008 Nov 4;9:468. doi: 10.1186/1471-2105-9-468.
9
Assessment of Semantic Similarity between Proteins Using Information Content and Topological Properties of the Gene Ontology Graph.
IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):839-849. doi: 10.1109/TCBB.2017.2689762. Epub 2017 Mar 31.
10
NewGOA: Predicting New GO Annotations of Proteins by Bi-Random Walks on a Hybrid Graph.
IEEE/ACM Trans Comput Biol Bioinform. 2018 Jul-Aug;15(4):1390-1402. doi: 10.1109/TCBB.2017.2715842. Epub 2017 Jun 15.

Cited by

1
MultiScale-CNN-4mCPred: a multi-scale CNN and adaptive embedding-based method for mouse genome DNA N4-methylcytosine prediction.
BMC Bioinformatics. 2023 Jan 18;24(1):21. doi: 10.1186/s12859-023-05135-0.
2
Predicting functions of maize proteins using graph convolutional network.
BMC Bioinformatics. 2020 Dec 16;21(Suppl 16):420. doi: 10.1186/s12859-020-03745-6.
3
A Literature Review of Gene Function Prediction by Modeling Gene Ontology.
Front Genet. 2020 Apr 24;11:400. doi: 10.3389/fgene.2020.00400. eCollection 2020.
4
Predicting Protein Functions Based on Differential Co-expression and Neighborhood Analysis.
J Comput Biol. 2021 Jan;28(1):1-18. doi: 10.1089/cmb.2019.0120. Epub 2020 Apr 17.
5
Improving protein function prediction using protein sequence and GO-term similarities.
Bioinformatics. 2019 Apr 1;35(7):1116-1124. doi: 10.1093/bioinformatics/bty751.