Context-aware multi-token concept recognition of biological entities.

Affiliations

Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea.

Bio-Synergy Research Center, Daejeon, South Korea.

Publication information

BMC Bioinformatics. 2021 Oct 21;22(Suppl 11):337. doi: 10.1186/s12859-021-04248-8.

Abstract

BACKGROUND

Concept recognition refers to the two sequential steps of named entity recognition and named entity normalization, and it plays an essential role in bioinformatics. However, conventional dictionary-based methods have not sufficiently addressed the variation of concepts as they are actually used in the literature, which degrades performance particularly in the recognition of multi-token concepts.
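To make the limitation concrete, the following is a minimal sketch of a dictionary-based recognizer, assuming a toy dictionary with made-up entries and hypothetical `TOY:` identifiers (none of these come from the paper). It uses greedy longest exact matching, so a reordered or inflected multi-token mention is missed.

```python
import re

# Hypothetical toy dictionary: surface form -> concept identifier.
# The entries and the "TOY:" identifiers are made up for illustration.
TOY_DICTIONARY = {
    "tumor necrosis factor": "TOY:0001",
    "insulin": "TOY:0002",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def recognize_concepts(text, dictionary=TOY_DICTIONARY):
    """Greedy longest exact match of dictionary entries over the tokens.

    Returns a list of (matched surface form, concept identifier) pairs.
    """
    tokens = tokenize(text)
    entries = {tuple(tokenize(name)): cid for name, cid in dictionary.items()}
    max_len = max(len(key) for key in entries)
    found, i = [], 0
    while i < len(tokens):
        # Try the longest possible span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in entries:
                found.append((" ".join(span), entries[span]))
                i += n
                break
        else:
            # No dictionary entry starts at this token.
            i += 1
    return found


exact = recognize_concepts("Tumor necrosis factor and insulin were measured.")
variant = recognize_concepts("The necrosis factor of tumors was elevated.")
```

Here `exact` contains both concepts, while `variant` is empty: the reordered, pluralized mention no longer matches any dictionary entry exactly.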

RESULTS

In this paper, we propose a concept recognition method for multi-token biological entities that uses neural models combined with literature contexts. The key aspect of our method is the use of contextual information from biological knowledge bases for concept normalization, which is followed by a named entity recognition procedure. The model showed improved performance over conventional methods, particularly for multi-token concepts with higher variation.
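The idea of normalizing a mention by its context can be sketched as follows. This is not the authors' neural model: it swaps in a plain bag-of-words cosine similarity between the sentence and a short description of each candidate concept. The knowledge-base entries, descriptions, and `TOY:` identifiers are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: concept identifier -> short description.
TOY_KB = {
    "TOY:0001": "cytokine involved in inflammation and immune signaling",
    "TOY:0002": "hormone that regulates blood glucose levels",
}


def bag_of_words(text):
    """Count lowercase alphanumeric tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize_by_context(sentence, kb=TOY_KB):
    """Return the best-scoring concept identifier and all scores."""
    context = bag_of_words(sentence)
    scores = {cid: cosine(context, bag_of_words(desc)) for cid, desc in kb.items()}
    best = max(scores, key=scores.get)
    return best, scores


best_id, scores = normalize_by_context(
    "The factor promotes inflammation through immune signaling in tumors."
)
```

In this toy setting the sentence shares words only with the first description, so `best_id` is `TOY:0001` even though the sentence contains no exact dictionary name. A learned model would replace the word-overlap score with contextual representations.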

CONCLUSIONS

We expect that our model can be used for effective concept recognition and for a variety of natural language processing tasks in bioinformatics.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/97cf/8529713/88b1df8aab63/12859_2021_4248_Fig1_HTML.jpg
