BioCreAtIvE task 1A: gene mention finding evaluation.

Authors

Yeh Alexander, Morgan Alexander, Colosimo Marc, Hirschman Lynette

Affiliation

The MITRE Corporation, 202 Burlington Road, Bedford, MA 01730, USA.

Publication information

BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2105-6-S1-S2. Epub 2005 May 24.

DOI:10.1186/1471-2105-6-S1-S2

PMID:15960832

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1869012/

Abstract

BACKGROUND

The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on a particular topic. There has been an increasing amount of work on text mining this literature, but comparing this work is hard because of a lack of standards for making comparisons. To address this, we worked with colleagues at the Protein Design Group, CNB-CSIC, Madrid to develop BioCreAtIvE (Critical Assessment for Information Extraction in Biology), an open common evaluation of systems on a number of biological text mining tasks. We report here on task 1A, which deals with finding mentions of genes and related entities in text. "Finding mentions" is a basic task, which can be used as a building block for other text mining tasks. The task makes use of data and evaluation software provided by the (US) National Center for Biotechnology Information (NCBI).

RESULTS

15 teams took part in task 1A. A number of teams achieved scores over 80% F-measure (balanced precision and recall). The teams that tried to use their task 1A systems to help on other BioCreAtIvE tasks reported mixed results.

CONCLUSION

The F-measure results of over 80% are good, but they still lag somewhat behind the best scores achieved in some other domains such as newswire, in part because gene names are longer and more complex than the person or organization names found in newswire.
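
The F-measure quoted above is the balanced harmonic mean of precision and recall. A minimal sketch of that calculation follows, assuming mention-level counts of true positives, false positives, and false negatives are already available. The function name and the example counts are illustrative assumptions; they are not taken from the task 1A evaluation software or its results.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Return (precision, recall, balanced F-measure) from mention counts."""
    predicted = true_positives + false_positives  # mentions the system proposed
    actual = true_positives + false_negatives     # mentions in the gold standard
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts, chosen only to illustrate the arithmetic.
precision, recall, f = f_measure(80, 20, 20)
print(f"precision={precision:.2f} recall={recall:.2f} F={f:.2f}")
```

With equal precision and recall, the F-measure equals both; when they differ, the harmonic mean is pulled toward the lower of the two.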
