Landolsi Mohamed Yassine, Hlaoua Lobna, Ben Romdhane Lotfi
MARS Research Laboratory, SDM Research Group, ISITCom, University of Sousse, Hammam Sousse, Tunisia.
Knowl Inf Syst. 2023;65(2):463-516. doi: 10.1007/s10115-022-01779-1. Epub 2022 Nov 8.
In the medical field, a doctor must build comprehensive knowledge by reading and writing narrative documents, and is responsible for every decision made for patients. Unfortunately, reading all the necessary information about drugs, diseases and patients is very tiring because the volume of documents grows every day. As a result, many medical errors can occur, some of them fatal. Information extraction is an important field that can address this problem. It comprises several tasks that extract the desired information from unstructured text written in natural language. The principal tasks are named entity recognition and relation extraction, since they structure the text by extracting the relevant information. However, processing narrative text requires natural language processing techniques to extract useful information and features. In this paper, we introduce and discuss the techniques and solutions used in these tasks. Furthermore, we outline the challenges of information extraction from medical documents. To our knowledge, this is the most comprehensive survey in the literature, with an experimental analysis and suggestions for some directions not yet covered.