
Similar articles

1
GO2Sum: Generating Human Readable Functional Summary of Proteins from GO Terms.
bioRxiv. 2023 Nov 15:2023.11.10.566665. doi: 10.1101/2023.11.10.566665.
2
GO2Sum: generating human-readable functional summary of proteins from GO terms.
NPJ Syst Biol Appl. 2024 Mar 15;10(1):29. doi: 10.1038/s41540-024-00358-0.
3
GOnet: a tool for interactive Gene Ontology analysis.
BMC Bioinformatics. 2018 Dec 7;19(1):470. doi: 10.1186/s12859-018-2533-3.
4
The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology.
Nucleic Acids Res. 2004 Jan 1;32(Database issue):D262-6. doi: 10.1093/nar/gkh021.
5
GO-Module: functional synthesis and improved interpretation of Gene Ontology patterns.
Bioinformatics. 2011 May 15;27(10):1444-6. doi: 10.1093/bioinformatics/btr142. Epub 2011 Mar 17.
6
An evaluation of GO annotation retrieval for BioCreAtIvE and GOA.
BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S17. doi: 10.1186/1471-2105-6-S1-S17. Epub 2005 May 24.
7
A drug target slim: using gene ontology and gene ontology annotations to navigate protein-ligand target space in ChEMBL.
J Biomed Semantics. 2016 Sep 27;7(1):59. doi: 10.1186/s13326-016-0102-0.
8
Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.
PLoS One. 2013;8(3):e58759. doi: 10.1371/journal.pone.0058759. Epub 2013 Mar 11.
9
HashGO: hashing gene ontology for protein function prediction.
Comput Biol Chem. 2017 Dec;71:264-273. doi: 10.1016/j.compbiolchem.2017.09.010. Epub 2017 Oct 4.
10
NaviGO: interactive tool for visualization and functional similarity and coherence analysis with gene ontology.
BMC Bioinformatics. 2017 Mar 20;18(1):177. doi: 10.1186/s12859-017-1600-5.

GO2Sum: Generating Human Readable Functional Summary of Proteins from GO Terms.

Authors

Giri Swagarika Jaharlal, Ibtehaz Nabil, Kihara Daisuke

Affiliations

Department of Computer Science, Purdue University, West Lafayette, IN, United States.

Department of Biological Sciences, Purdue University, West Lafayette, IN, United States.

Publication information

bioRxiv. 2023 Nov 15:2023.11.10.566665. doi: 10.1101/2023.11.10.566665.

DOI: 10.1101/2023.11.10.566665
PMID: 38014080
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10680659/
Abstract

Understanding the biological functions of proteins is of fundamental importance in modern biology. To represent the function of proteins, Gene Ontology (GO), a controlled vocabulary, is frequently used because computer programs can handle it easily without open-ended text interpretation. In particular, the majority of current protein function prediction methods rely on GO terms. However, the extensive list of GO terms that describes a protein's function can be difficult for biologists to interpret. To address this issue, we developed GO2Sum (Gene Ontology terms Summarizer), a model that takes a set of GO terms as input and generates a human-readable summary using the T5 large language model. GO2Sum was developed by fine-tuning T5 on GO term assignments and free-text function descriptions for UniProt entries, enabling it to recreate function descriptions by concatenating GO term descriptions. Our results demonstrate that GO2Sum significantly outperforms the original T5 model, which was trained on the entire web corpus, in generating the Function, Subunit Structure, and Pathway paragraphs for UniProt entries.
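
The input-construction step described in the abstract (a set of GO terms turned into one text sequence by concatenating their descriptions) could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' code: the function name build_model_input, the "summarize: " prefix (a common T5 convention), the separator, and the short description strings are all hypothetical placeholders rather than official GO definitions or the paper's actual input format.

```python
# Hypothetical lookup table: GO ID -> (term name, short description).
# The description strings are illustrative placeholders, not official GO definitions.
GO_DESCRIPTIONS = {
    "GO:0005524": ("ATP binding", "Binding to ATP."),
    "GO:0004672": ("protein kinase activity", "Catalysis of the phosphorylation of a protein."),
    "GO:0006468": ("protein phosphorylation", "Addition of a phosphate group to a protein."),
}


def build_model_input(go_ids, descriptions=GO_DESCRIPTIONS, prefix="summarize: "):
    """Concatenate GO term descriptions into a single text-to-text input string.

    Unknown GO IDs are skipped and duplicates are used only once, so the
    result depends on the order of first appearance in go_ids.
    """
    seen = set()
    parts = []
    for go_id in go_ids:
        if go_id in seen or go_id not in descriptions:
            continue
        seen.add(go_id)
        name, text = descriptions[go_id]
        parts.append(f"{name}: {text}")
    # The prefix and the single-space separator are assumptions for illustration.
    return prefix + " ".join(parts)


if __name__ == "__main__":
    print(build_model_input(["GO:0005524", "GO:0004672", "GO:0006468"]))
```

A string built this way would then be tokenized and passed to a fine-tuned sequence-to-sequence model; that step is omitted here because the abstract does not specify the model size, tokenizer settings, or decoding parameters.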
