Rubio-López Ignacio, Costumero Roberto, Ambit Héctor, Gonzalo-Martín Consuelo, Menasalvas Ernestina, Rodríguez González Alejandro
Universidad Politécnica de Madrid, Centro de Tecnología Biomédica, Spain.
Stud Health Technol Inform. 2017;235:251-255.
Electronic Health Records (EHRs) are now widely used in hospitals, which has motivated the development of new methods for processing clinical narratives (unstructured data) that make context-based searches possible. Current approaches to processing the unstructured text in EHRs apply text mining or natural language processing (NLP) techniques to the data. In particular, Named Entity Recognition (NER) is of paramount importance for retrieving specific biomedical concepts from the text, together with the semantic type of each concept retrieved. However, clinical notes very commonly contain many acronyms that NER processes cannot identify, and even when an acronym is identified, it may correspond to several meanings, so the term found must be disambiguated. In this work we present an approach to acronym disambiguation in Spanish EHRs using machine learning techniques.