Vanalli M, Cesare M, Cocchieri A, D'Agostino F
Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy.
Fondazione Policlinico Universitario A. Gemelli IRCCS, Catholic University of the Sacred Heart, Rome, Italy.
Ann Ig. 2023 Jan-Feb;35(1):3-20. doi: 10.7416/ai.2022.2517. Epub 2022 Apr 12.
Nurses record data in electronic health records (EHRs) using different terminologies and coding systems. The purpose of this study was to identify unstructured free-text nursing activities recorded by nurses in EHRs with natural language processing (NLP) techniques and to map these nursing activities into standard nursing activities using the SMASH method.
A retrospective study using NLP techniques with a unidirectional mapping strategy called SMASH.
The unstructured free-text nursing activities recorded in the Medicine, Neurology and Gastroenterology inpatient units of the Agostino Gemelli IRCCS University Hospital Foundation, Rome, Italy were collected over 6 months in 2018. Data were analyzed in three phases: a) text summarization with NLP techniques, b) a consensus analysis by four experts to determine the category of word stems, and c) cross-mapping with SMASH. The SMASH method computed string comparison, similarity and distance between words using the Levenshtein distance (LD) and the Jaro-Winkler distance, and applied the following cross-mapping cut-offs: map [0.80-1.00] with LD < 13, partial map [0.50-0.79] with LD < 13, and no map [0.0-0.49] with LD > 13.
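The cross-mapping step described above can be illustrated with a minimal Python sketch. It is not the authors' SMASH implementation: the function names (`levenshtein`, `jaro_winkler`, `classify`) are hypothetical, the Jaro-Winkler prefix weight of 0.1 is the conventional default rather than a value given in the abstract, and the handling of cases the stated cut-offs do not cover (for example a similarity of 0.85 with LD of 13 or more, or a similarity between 0.79 and 0.80) is an assumption made here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_weight: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):
        if c1 != c2:
            break
        prefix += 1
    return base + prefix * prefix_weight * (1 - base)


def classify(free_text: str, standard: str) -> str:
    """Apply the abstract's cut-offs; uncovered cases fall to 'no map' (assumption)."""
    a, b = free_text.strip().lower(), standard.strip().lower()
    similarity = jaro_winkler(a, b)
    distance = levenshtein(a, b)
    if distance < 13 and similarity >= 0.80:
        return "map"
    if distance < 13 and similarity >= 0.50:
        return "partial-map"
    return "no map"
```

In practice each free-text activity would be compared against every standard activity and assigned the label of its best-scoring candidate; how SMASH selects among candidates is not stated in the abstract.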
During the study period, 491 patient records were assessed, in which nurses recorded 548 different unstructured free-text nursing activities. Of these, 451 (82.3%) were mapped to standard PAI nursing activities, 47 (8.7%) were partially mapped, and 50 (9.0%) were not mapped. This automated mapping yielded a recall of 0.95, precision of 0.94, accuracy of 0.91 and F-measure of 0.96. The F-measure indicates good reliability of this automated procedure in cross-mapping.
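For readers unfamiliar with these evaluation measures, the sketch below shows how recall, precision, accuracy and a balanced F-measure (F1) are conventionally derived from confusion-matrix counts. The abstract does not report the underlying counts or state which F-measure variant was used, so the function name `classification_metrics` and the counts in the usage lines are hypothetical and are not the study's data.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```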
Lexical similarities were found between unstructured free-text nursing activities and standard nursing activities. NLP with the SMASH method is a feasible approach for extracting data on nursing concepts that are not recorded through structured data entry.