Hierarchical deep learning for predicting GO annotations by integrating protein knowledge.

Affiliations

Bioengineering and Bioinformatics Research and Development Institute (IBB), FI-UNER, CONICET, Oro Verde 3100, Argentina.

Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Ciudad Universitaria UNL, Santa Fe 3000, Argentina.

Publication information

Bioinformatics. 2022 Sep 30;38(19):4488-4496. doi: 10.1093/bioinformatics/btac536.

DOI:10.1093/bioinformatics/btac536
PMID:35929781
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9524999/
Abstract

MOTIVATION

Experimental testing and manual curation are the most precise ways of assigning Gene Ontology (GO) terms that describe protein functions. However, they are expensive and time-consuming, and they cannot keep pace with the exponential growth of data generated by high-throughput sequencing methods. Hence, researchers need reliable computational systems to help fill the gap through automatic function prediction. The results of the last Critical Assessment of Function Annotation (CAFA) challenge revealed that GO-term prediction remains a very challenging task. Recent developments in deep learning are pushing the frontiers of protein research and yielding new knowledge thanks to the integration of data from multiple sources. However, the deep models developed so far for function prediction focus mainly on sequence data and have not yet achieved breakthrough performance.

RESULTS

We propose DeeProtGO, a novel deep-learning model for predicting GO annotations by integrating protein knowledge. DeeProtGO was trained to solve 18 different prediction problems, defined by the three GO sub-ontologies, the type of protein and the taxonomic kingdom. In our experiments, prediction quality was higher when more protein knowledge was integrated. We also benchmarked DeeProtGO against state-of-the-art methods on public datasets and showed that it can effectively improve the prediction of GO annotations.
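
The abstract does not say how DeeProtGO treats the GO hierarchy, so the following is only a generic illustration of one convention commonly used in hierarchical GO prediction, the true-path rule: a protein annotated with a term is implicitly annotated with all of that term's ancestors, so a predicted score for an ancestor should not be lower than the score of any descendant. The toy ontology, the term names and the function names below are hypothetical and are not taken from the DeeProtGO repository.

```python
from typing import Dict, List, Set

# Toy is_a hierarchy (child -> parents). The term names are hypothetical
# placeholders, not real GO identifiers.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalytic": ["root"],
    "protein_binding": ["binding"],
    "kinase": ["catalytic", "binding"],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every ancestor of `term`, excluding the term itself."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def propagate_scores(
    scores: Dict[str, float], parents: Dict[str, List[str]]
) -> Dict[str, float]:
    """Make scores hierarchy-consistent: each ancestor gets at least the
    maximum score of its descendants (true-path rule)."""
    # Terms without a raw prediction start at 0.0.
    consistent = {term: scores.get(term, 0.0) for term in parents}
    for term, score in scores.items():
        for anc in ancestors(term, parents):
            consistent[anc] = max(consistent[anc], score)
    return consistent


# Hypothetical raw per-term scores from some classifier.
raw = {"kinase": 0.9, "protein_binding": 0.4, "binding": 0.2}
fixed = propagate_scores(raw, PARENTS)
print(fixed)
```

In this sketch the high score for the specific term is pushed up to its more general ancestors, while the lower score of an unrelated sibling term is left unchanged. Other post-processing schemes exist; this one was chosen only because it is short and easy to verify by hand.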

AVAILABILITY AND IMPLEMENTATION

DeeProtGO and a use case are available at https://github.com/gamerino/DeeProtGO.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5252/9524999/5f86db6bcb3d/btac536f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5252/9524999/190fb1a2792e/btac536f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5252/9524999/1b9a5b0b46a7/btac536f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5252/9524999/8e7b1ed0fbae/btac536f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5252/9524999/1735f82d799f/btac536f5.jpg

Similar articles

1
Hierarchical deep learning for predicting GO annotations by integrating protein knowledge.
Bioinformatics. 2022 Sep 30;38(19):4488-4496. doi: 10.1093/bioinformatics/btac536.
2
exp2GO: Improving Prediction of Functions in the Gene Ontology With Expression Data.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):999-1008. doi: 10.1109/TCBB.2022.3167245. Epub 2023 Apr 3.
3
PFresGO: an attention mechanism-based deep-learning approach for protein annotation by integrating gene ontology inter-relationships.
Bioinformatics. 2023 Mar 1;39(3). doi: 10.1093/bioinformatics/btad094.
4
Mutual annotation-based prediction of protein domain functions with Domain2GO.
Protein Sci. 2024 Jun;33(6):e4988. doi: 10.1002/pro.4988.
5
DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.
Bioinformatics. 2018 Feb 15;34(4):660-668. doi: 10.1093/bioinformatics/btx624.
6
Onto2Vec: joint vector-based representation of biological entities and their ontology-based annotations.
Bioinformatics. 2018 Jul 1;34(13):i52-i60. doi: 10.1093/bioinformatics/bty259.
7
Co-complex protein membership evaluation using Maximum Entropy on GO ontology and InterPro annotation.
Bioinformatics. 2018 Jun 1;34(11):1884-1892. doi: 10.1093/bioinformatics/btx803.
8
Improving protein function prediction using protein sequence and GO-term similarities.
Bioinformatics. 2019 Apr 1;35(7):1116-1124. doi: 10.1093/bioinformatics/bty751.
9
Protein Function Prediction With Functional and Topological Knowledge of Gene Ontology.
IEEE Trans Nanobioscience. 2023 Oct;22(4):755-762. doi: 10.1109/TNB.2023.3278033. Epub 2023 Oct 3.
10
Deep learning model for protein multi-label subcellular localization and function prediction based on multi-task collaborative training.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae568.

Cited by

1
The CABANA model 2017-2022: research and training synergy to facilitate bioinformatics applications in Latin America.
Front Educ (Lausanne). 2024 Jul 4;9. doi: 10.3389/feduc.2024.1358620.
2
Optimizing Scorpion Toxin Processing through Artificial Intelligence.
Toxins (Basel). 2024 Oct 11;16(10):437. doi: 10.3390/toxins16100437.
3
Osmoprotectants play a major role in the resistance to high levels of salinity stress-insights from a metabolomics and proteomics integrated approach.
Front Plant Sci. 2023 Jun 13;14:1187803. doi: 10.3389/fpls.2023.1187803. eCollection 2023.
4
PFresGO: an attention mechanism-based deep-learning approach for protein annotation by integrating gene ontology inter-relationships.
Bioinformatics. 2023 Mar 1;39(3). doi: 10.1093/bioinformatics/btad094.
5
Elucidating the functional roles of prokaryotic proteins using big data and artificial intelligence.
FEMS Microbiol Rev. 2023 Jan 16;47(1). doi: 10.1093/femsre/fuad003.

References

1
Improving protein tertiary structure prediction by deep learning and distance prediction in CASP14.
Proteins. 2022 Jan;90(1):58-72. doi: 10.1002/prot.26186. Epub 2021 Jul 27.
2
DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction.
Bioinformatics. 2021 Jul 12;37(Suppl_1):i262-i271. doi: 10.1093/bioinformatics/btab270.
3
ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7112-7127. doi: 10.1109/TPAMI.2021.3095381. Epub 2022 Sep 14.
4
TALE: Transformer-based protein function Annotation with joint sequence-Label Embedding.
Bioinformatics. 2021 Sep 29;37(18):2825-2833. doi: 10.1093/bioinformatics/btab198.
5
Embeddings from deep learning transfer GO annotations beyond homology.
Sci Rep. 2021 Jan 13;11(1):1160. doi: 10.1038/s41598-020-80786-0.
6
Automatic Gene Function Prediction in the 2020's.
Genes (Basel). 2020 Oct 27;11(11):1264. doi: 10.3390/genes11111264.
7
Deep learning for mining protein data.
Brief Bioinform. 2021 Jan 18;22(1):194-218. doi: 10.1093/bib/bbz156.
8
Complexity measures of the mature miRNA for improving pre-miRNAs prediction.
Bioinformatics. 2020 Apr 15;36(8):2319-2327. doi: 10.1093/bioinformatics/btz940.
9
Modeling aspects of the language of life through transfer-learning protein sequences.
BMC Bioinformatics. 2019 Dec 17;20(1):723. doi: 10.1186/s12859-019-3220-8.
10
The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.
Genome Biol. 2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.