An analysis on the entity annotations in biological corpora.

Author information

Neves Mariana

Affiliation

Hasso-Plattner-Institut, Potsdam Universität, Potsdam, Germany.

Publication information

F1000Res. 2014 Apr 25;3:96. doi: 10.12688/f1000research.3216.1. eCollection 2014.

Abstract

Collections of documents annotated with semantic entities and relationships are crucial resources to support the development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and an analysis of the semantic annotations they contain. Annotations for entity types were classified into six semantic groups, and an overview of the semantic entities found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals, are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b410/4168744/f78f457d4964/f1000research-3-3456-g0000.jpg
