Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy.
Catholic University of the Sacred Heart, Rome, Italy.
Stud Health Technol Inform. 2024 Aug 22;316:909-913. doi: 10.3233/SHTI240559.
Electronic Health Records (EHRs) contain a wealth of unstructured patient data, which makes it challenging for physicians to make informed decisions. In this paper, we introduce a Natural Language Processing (NLP) approach for extracting therapies, diagnoses, and symptoms from the ambulatory EHRs of patients with chronic Lupus disease. We aim to demonstrate the effectiveness of a comprehensive pipeline in which a rule-based system is combined with text segmentation, transformer-based topic analysis, and a clinical ontology in order to enhance text preprocessing and automate rule identification. Our approach is applied to a sub-cohort of 56 patients, with a total of 750 EHRs written in Italian, and achieves an accuracy above 97% and an F-score above 90% in the three extracted domains. This work has the potential to be integrated with EHR systems to automate information extraction, minimize human intervention, and provide personalized digital solutions in the chronic Lupus disease domain.
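As a rough illustration of the rule-based part of such a pipeline, the following minimal sketch matches a small term lexicon against an Italian clinical note and scores the result with an F-score. It is not the authors' implementation: the lexicon entries, the example note, and the names `LEXICON`, `extract`, and `f_score` are all assumptions made for this example, and the segmentation, topic analysis, and ontology steps described in the abstract are not modeled.

```python
import re

# Hypothetical mini-lexicon; the terms are illustrative and not taken from the paper.
LEXICON = {
    "therapy": ["idrossiclorochina", "prednisone"],
    "diagnosis": ["lupus eritematoso sistemico", "nefrite lupica"],
    "symptom": ["artralgia", "astenia", "rash"],
}

# One case-insensitive, whole-word pattern per domain.
PATTERNS = {
    domain: re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    for domain, terms in LEXICON.items()
}


def extract(text):
    """Return, for each domain, the set of lexicon terms found in the text."""
    return {
        domain: {m.group(0).lower() for m in pattern.finditer(text)}
        for domain, pattern in PATTERNS.items()
    }


def f_score(predicted, gold):
    """F1 score between a predicted and a gold-standard set of terms."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example note (assumption, not real patient data).
note = "Paziente con nefrite lupica, riferisce artralgia. Terapia: Prednisone 5 mg."
found = extract(note)
```

In a setting like the one described, the per-domain sets in `found` would be compared with manually annotated terms through `f_score`; a clinical ontology could, under the same assumptions, supply the lexicon entries instead of a hand-written dictionary.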