Unification of functional annotation descriptions using text mining.

Affiliations

Systems Ecology, Esch-sur-Alzette, Luxembourg.

Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 4362, Esch-sur-Alzette, Luxembourg.

Publication information

Biol Chem. 2021 May 13;402(8):983-990. doi: 10.1515/hsz-2021-0125. Print 2021 Jul 27.

DOI: 10.1515/hsz-2021-0125
PMID: 33984880

Abstract

A common approach to genome annotation involves the use of homology-based tools for the prediction of the functional role of proteins. The quality of functional annotations is dependent on the reference data used; as such, choosing the appropriate sources is crucial. Unfortunately, no single reference data source can be universally considered the gold standard; thus, using multiple references could potentially increase annotation quality and coverage. However, this comes with challenges, particularly due to the introduction of redundant and exclusive annotations. Through text mining, it is possible to identify highly similar functional descriptions, thus strengthening the confidence of the final protein functional annotation and providing a redundancy-free output. Here we present UniFunc, a text mining approach that is able to detect similar functional descriptions with high precision. UniFunc was built as a small module and can be independently used or integrated into protein function annotation pipelines. By removing the need to individually analyse and compare annotation results, UniFunc streamlines the complementary use of multiple reference datasets.
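To make the idea of detecting highly similar functional descriptions concrete, the sketch below compares free-text annotations with a plain bag-of-words cosine similarity and keeps one representative per group of near-duplicates. This is not UniFunc's algorithm, which the abstract does not describe in detail; it is a generic illustration. The stop-word list, the function names (`tokenize`, `cosine_similarity`, `merge_redundant`) and the 0.7 threshold are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, chosen for this illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "protein", "putative"}


def tokenize(description):
    """Lowercase a description and return its informative word tokens."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return [w for w in words if w not in STOP_WORDS]


def cosine_similarity(desc_a, desc_b):
    """Cosine similarity between bag-of-words vectors of two descriptions."""
    vec_a, vec_b = Counter(tokenize(desc_a)), Counter(tokenize(desc_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description matches nothing
    return dot / (norm_a * norm_b)


def merge_redundant(descriptions, threshold=0.7):
    """Keep the first description of each group of near-identical ones.

    The threshold is an assumed value, not one taken from the paper.
    """
    kept = []
    for desc in descriptions:
        if all(cosine_similarity(desc, k) < threshold for k in kept):
            kept.append(desc)
    return kept


if __name__ == "__main__":
    annotations = [
        "Glucose-6-phosphate isomerase",    # e.g. from one reference source
        "glucose 6 phosphate isomerase",    # same function, another source
        "ABC transporter permease",
    ]
    print(merge_redundant(annotations))
```

A real pipeline would likely need more than token overlap (for example handling of synonyms, identifiers and enzyme nomenclature), so the sketch should be read only as a baseline for the redundancy-removal step.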

Similar articles

1
Unification of functional annotation descriptions using text mining.
Biol Chem. 2021 May 13;402(8):983-990. doi: 10.1515/hsz-2021-0125. Print 2021 Jul 27.
2
PANNZER: high-throughput functional annotation of uncharacterized proteins in an error-prone environment.
Bioinformatics. 2015 May 15;31(10):1544-52. doi: 10.1093/bioinformatics/btu851. Epub 2015 Jan 8.
3
Finding Gene Associations by Text Mining and Annotating it with Gene Ontology.
Methods Mol Biol. 2022;2496:71-90. doi: 10.1007/978-1-0716-2305-3_4.
4
Mantis: flexible and consensus-driven genome annotation.
Gigascience. 2021 Jun 2;10(6). doi: 10.1093/gigascience/giab042.
5
Integrating protein-protein interactions and text mining for protein function prediction.
BMC Bioinformatics. 2008 Jul 22;9 Suppl 8(Suppl 8):S2. doi: 10.1186/1471-2105-9-S8-S2.
6
Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine.
Database (Oxford). 2019 Jan 1;2019:bay147. doi: 10.1093/database/bay147.
7
The eFIP system for text mining of protein interaction networks of phosphorylated proteins.
Database (Oxford). 2012 Dec 5;2012:bas044. doi: 10.1093/database/bas044. Print 2012.
8
How to link ontologies and protein-protein interactions to literature: text-mining approaches and the BioCreative experience.
Database (Oxford). 2012 Mar 21;2012:bas017. doi: 10.1093/database/bas017. Print 2012.
9
LocText: relation extraction of protein localizations to assist database curation.
BMC Bioinformatics. 2018 Jan 17;19(1):15. doi: 10.1186/s12859-018-2021-9.
10
Text mining improves prediction of protein functional sites.
PLoS One. 2012;7(2):e32171. doi: 10.1371/journal.pone.0032171. Epub 2012 Feb 29.

Cited by

1
A Survey of Biological Function Prediction Methods with Focus on Natural Language Processing (NLP) and Large Language Models (LLM).
Methods Mol Biol. 2025;2941:201-225. doi: 10.1007/978-1-0716-4623-6_13.
2
Functional profiling of the sequence stockpile: a protein pair-based assessment of in silico prediction tools.
Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf035.
3
Genus-Wide Transcriptional Landscapes Reveal Correlated Gene Networks Underlying Microevolutionary Divergence in Diatoms.
Mol Biol Evol. 2023 Oct 4;40(10). doi: 10.1093/molbev/msad218.
4
VEBA: a modular end-to-end suite for in silico recovery, clustering, and analysis of prokaryotic, microeukaryotic, and viral genomes from metagenomes.
BMC Bioinformatics. 2022 Oct 12;23(1):419. doi: 10.1186/s12859-022-04973-8.