Jovanović Jelena, Bagheri Ebrahim
Department of Software Engineering, University of Belgrade, 154 Jove Ilica Street, Belgrade, Serbia.
Department of Electrical Engineering, Ryerson University, 245 Church Street, Toronto, Canada.
J Biomed Semantics. 2017 Sep 22;8(1):44. doi: 10.1186/s13326-017-0153-x.
The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine-intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators.
Over the last dozen years, the biomedical research community has invested significant effort in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state-of-the-art biomedical semantic annotators, focusing particularly on general-purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvement of today's annotators, which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.