Department of Computer Science, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Trondheim, Norway.
Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway.
J Am Med Inform Assoc. 2022 Jan 29;29(3):559-575. doi: 10.1093/jamia/ocab236.
To determine the effects of using unstructured clinical text in machine learning (ML) for prediction, early detection, and identification of sepsis.
PubMed, Scopus, ACM DL, dblp, and IEEE Xplore databases were searched. Articles utilizing clinical text for ML or natural language processing (NLP) to detect, identify, recognize, diagnose, or predict the onset, development, progress, or prognosis of systemic inflammatory response syndrome, sepsis, severe sepsis, or septic shock were included. Sepsis definition, dataset, types of data, ML models, NLP techniques, and evaluation metrics were extracted.
The clinical text used in models includes narrative notes written by nurses, physicians, and specialists in varying situations. This text is often combined with common structured data such as demographics, vital signs, laboratory data, and medications. Area under the receiver operating characteristic curve (AUC) comparison of ML methods showed that utilizing both text and structured data predicts sepsis earlier and more accurately than structured data alone. No meta-analysis was performed because of incomparable measurements among the 9 included studies.
Studies focused on sepsis identification or early detection before onset; no studies used patient histories beyond the current episode of care to predict sepsis. The sepsis definition used affects reporting methods, outcomes, and results. Many methods rely on continuous vital sign measurements in intensive care, which makes them difficult to transfer to general wards.
Approaches were heterogeneous, but studies showed that utilizing both unstructured text and structured data in ML can improve identification and early detection of sepsis.
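The AUC comparison mentioned in the results can be illustrated with a minimal sketch. It computes AUC as the probability that a randomly chosen sepsis case receives a higher score than a randomly chosen non-case (the Mann-Whitney formulation). The labels and both score lists are hypothetical toy values chosen for illustration only; they are not taken from any of the included studies, and the function name `auc` is likewise an assumption.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = sepsis, 0 = no sepsis.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical risk scores from a model using structured data only.
structured_only = [0.9, 0.4, 0.6, 0.5, 0.2, 0.7, 0.1, 0.3]
# Hypothetical risk scores from a model using text plus structured data.
text_plus_structured = [0.9, 0.6, 0.8, 0.5, 0.2, 0.65, 0.1, 0.7]

auc_structured = auc(labels, structured_only)
auc_combined = auc(labels, text_plus_structured)
print(f"structured only: {auc_structured:.4f}")
print(f"text + structured: {auc_combined:.4f}")
```

A comparison like this is only meaningful when both models are scored on the same patients under the same sepsis definition, which is the comparability problem that prevented a meta-analysis across the included studies.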