GOAnnotator: linking protein GO annotations to evidence text.

Author information

Couto Francisco M, Silva Mário J, Lee Vivian, Dimmer Emily, Camon Evelyn, Apweiler Rolf, Kirsch Harald, Rebholz-Schuhmann Dietrich

Affiliation

Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Portugal.

Publication information

J Biomed Discov Collab. 2006 Dec 20;1:19. doi: 10.1186/1747-5333-1-19.

Abstract

BACKGROUND

Annotating proteins with Gene Ontology (GO) terms is an ongoing and complex task. Manual GO annotation is precise and valuable, but time-consuming. As a result, most proteins carry uncurated, automatically generated annotations rather than curated ones. Text-mining systems that annotate automatically from the literature have been proposed, but they do not meet curators' high quality expectations.

RESULTS

In this paper we describe an approach that links uncurated annotations to text extracted from literature. The selection of the text is based on the similarity of the text to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins.
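The text-selection step can be illustrated with a minimal sketch. The abstract does not say which similarity measure GOAnnotator uses, so the word-overlap score, the function names, and the threshold below are all assumptions made for illustration, not the tool's actual method.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric word tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_similarity(sentence, term_name):
    """Fraction of the term's words that also occur in the sentence.

    This overlap score is an illustrative stand-in; the measure used by
    GOAnnotator is not described in the abstract.
    """
    term_words = tokens(term_name)
    if not term_words:
        return 0.0
    return len(term_words & tokens(sentence)) / len(term_words)


def select_evidence(sentences, term_name, threshold=0.5):
    """Return (score, sentence) pairs scoring at or above threshold, best first.

    The threshold value is hypothetical.
    """
    scored = [(term_similarity(s, term_name), s) for s in sentences]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)
```

Given a list of sentences from an article and the name of a GO term taken from an uncurated annotation, `select_evidence` keeps only the sentences that mention most of the term's words, which a curator could then review as candidate evidence text.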

CONCLUSION

GO curators assessed GOAnnotator on a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to select only GO terms that are similar to GO terms from uncurated annotations in GOA. Our approach is the first to achieve high precision, which is crucial for supporting GO curators efficiently. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.
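The idea of restricting candidate terms through the GO hierarchy can be sketched as follows. The abstract does not define how "similar" is computed, so this sketch assumes the simplest possible reading: a candidate is kept only if it equals, is an ancestor of, or is a descendant of a term from an uncurated annotation. The term labels and the `parents` mapping are hypothetical toy data, not real GO identifiers.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_related(candidate, reference, parents):
    """True if the two terms are equal or one is an ancestor of the other.

    This relatedness test is an assumption for illustration only.
    """
    return (
        candidate == reference
        or candidate in ancestors(reference, parents)
        or reference in ancestors(candidate, parents)
    )


def filter_candidates(candidates, uncurated, parents):
    """Keep candidates related to at least one uncurated annotation term."""
    return [
        c for c in candidates
        if any(is_related(c, u, parents) for u in uncurated)
    ]


# Hypothetical toy hierarchy: child term -> list of parent terms.
toy_parents = {
    "binding": ["root"],
    "nucleotide_binding": ["binding"],
    "atp_binding": ["nucleotide_binding"],
    "transport": ["root"],
}
```

With this toy hierarchy, a term found in text such as `nucleotide_binding` would be kept for a protein whose uncurated annotation is `atp_binding`, while `transport` would be discarded because it lies on a different branch. Discarding unrelated terms in this way trades recall for precision.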
