
Overview of the gene ontology task at BioCreative IV.

Authors

Mao Yuqing, Van Auken Kimberly, Li Donghui, Arighi Cecilia N, McQuilton Peter, Hayman G Thomas, Tweedie Susan, Schaeffer Mary L, Laulederkind Stanley J F, Wang Shur-Jen, Gobeill Julien, Ruch Patrick, Luu Anh Tuan, Kim Jung-Jae, Chiang Jung-Hsien, Chen Yu-De, Yang Chia-Jung, Liu Hongfang, Zhu Dongqing, Li Yanpeng, Yu Hong, Emadzadeh Ehsan, Gonzalez Graciela, Chen Jian-Ming, Dai Hong-Jie, Lu Zhiyong

Affiliations

National Center for Biotechnology Information (NCBI), National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20817, USA, WormBase, Division of Biology, California Institute of Technology, 1200 E. California Boulevard, Pasadena, CA 91125, USA, TAIR, Department of Plant Biology, The Arabidopsis Information Resource, Carnegie Institution for Science, Stanford, CA 94305, USA, Center for Bioinformatics and Computational Biology, University of Delaware, 15 Innovation Way, Newark, DE 19711, USA, FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK, Rat Genome Database, Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA, USDA-ARS Plant Genetics Research Unit and Division of Plant Sciences, Department of Agronomy, University of Missouri, Columbia, MO 65211, USA, HES-SO, HEG, Library and Information Sciences, 7 route de Drize, CH-1227 Carouge, Switzerland, SIBtex, Swiss Institute of Bioinformatics, Rue Michel Servet 1, 1211 Geneva 4, Switzerland, School of Computer Engineering, Nanyang Technological University, Block N4, #02a-32, Nanyang Avenue, Singapore 639798, Department of Computer Science and Information Engineering, National Cheng-Kung University, No. 1, University Rd., Tainan 701, Taiwan, Republic of China, Department of Radiology, Mackay Memorial Hospital, Taitung Branch, Lane 303 Chang Sha St. Taitung, Taiwan, Republic of China, Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA, Department of Computer Science, University of Delaware, 101 Smith Hall, Newark, DE 19716, USA, Department of Quantitative Health Sciences, University of Massachusetts Medical School, 55 Lake Avenue North (AC7-059), Worcester, MA 01655 USA, Department of Biomedical Informatics, Arizona State University, 13212 East Shea Boulevard Scottsdale, AZ 85259 USA, Institute of Information Science, Academia Sinica, 128 Academia Road, Secti

Publication information

Database (Oxford). 2014 Aug 25;2014. doi: 10.1093/database/bau086. Print 2014.

DOI: 10.1093/database/bau086
PMID: 25157073
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4142793/
Abstract

Gene ontology (GO) annotation is a common task among model organism databases (MODs) for capturing gene function data from journal articles. It is time-consuming and labor-intensive, and is thus often considered one of the bottlenecks in literature curation. There is a growing need for semiautomated or fully automated GO curation techniques that will help database curators rapidly and accurately identify gene function information in full-length articles. Despite multiple attempts in the past, few studies have proven useful in assisting real-world GO curation. The shortage of sentence-level training data and of opportunities for interaction between text-mining developers and GO curators has limited advances in algorithm development and their use in practice. To this end, we organized a text-mining challenge task for literature-based GO annotation in BioCreative IV. More specifically, we developed two subtasks: (i) to automatically locate text passages that contain GO-relevant information (a text retrieval task) and (ii) to automatically identify relevant GO terms for the genes in a given article (a concept-recognition task). With support from five MODs, we provided teams with >4000 unique text passages that served as the basis for each GO annotation in our task data. Such evidence text information has long been recognized as critical for text-mining algorithm development but was never made available because of the high cost of curation. In total, seven teams participated in the challenge task. From the team results, we conclude that the state of the art in automatically mining GO terms from literature has improved over the past decade, while much progress is still needed for computer-assisted GO curation. Future work should focus on addressing the remaining technical challenges for improved performance of automatic GO concept recognition and on incorporating the practical benefits of text-mining tools into real-world GO annotation.

Database URL: http://www.biocreative.org/tasks/biocreative-iv/track-4-GO/


