Kulshrestha Sujay, Dligach Dmitriy, Joyce Cara, Gonzalez Richard, O'Rourke Ann P, Glazer Joshua M, Stey Anne, Kruser Jacqueline M, Churpek Matthew M, Afshar Majid
Burn and Shock Trauma Research Institute, Loyola University Chicago, Maywood, Illinois, USA.
Department of Surgery, Loyola University Medical Center, Maywood, Illinois, USA.
JAMIA Open. 2021 Mar 1;4(1):ooab015. doi: 10.1093/jamiaopen/ooab015. eCollection 2021 Jan.
Trauma quality improvement programs and registries improve care and outcomes for injured patients. Designated trauma centers calculate injury scores using dedicated trauma registrars; however, many injured patients present to nontrauma centers, leaving a substantial amount of data uncaptured. We propose automated methods that use machine learning (ML) and natural language processing (NLP) to identify severe chest injury from the electronic health record (EHR) for quality reporting.
Records from a level I trauma center were queried for patients presenting after injury between 2014 and 2018. Prediction modeling was performed to classify severe chest injury using a reference dataset labeled by certified registrars. Clinical documents from trauma encounters were processed into concept unique identifiers (CUIs), which served as inputs to three ML models: logistic regression with elastic net (EN) regularization, extreme gradient boosted (XGB) machines, and convolutional neural networks (CNN). The optimal model was identified by examining predictive validity metrics and by assessing face validity through global explanations.
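As a rough illustration of how documents reduced to concept unique identifiers could become model inputs, the sketch below builds a binary presence matrix with one row per encounter. The abstract does not state the encoding used, so this is an assumption. The identifiers `CUI_A` to `CUI_D`, the helper names, and the `min_count` parameter are hypothetical placeholders, not real UMLS concepts or the authors' code.

```python
from collections import Counter

def build_vocabulary(encounters, min_count=1):
    """Map each identifier seen in at least `min_count` encounters to a column index."""
    # Count each identifier once per encounter (document frequency).
    counts = Counter(cui for cuis in encounters for cui in set(cuis))
    kept = sorted(cui for cui, n in counts.items() if n >= min_count)
    return {cui: index for index, cui in enumerate(kept)}

def encode(cuis, vocab):
    """Return a 0/1 row marking which vocabulary identifiers occur in one encounter."""
    row = [0] * len(vocab)
    for cui in cuis:
        index = vocab.get(cui)
        if index is not None:  # identifiers outside the vocabulary are ignored
            row[index] = 1
    return row

# Hypothetical placeholder identifiers, one list per encounter.
encounters = [
    ["CUI_A", "CUI_B", "CUI_A"],
    ["CUI_B", "CUI_C"],
    ["CUI_D"],
]
vocab = build_vocabulary(encounters)
matrix = [encode(cuis, vocab) for cuis in encounters]
```

A presence matrix of this kind is a plausible input for a linear or tree-based classifier. A convolutional model would more likely consume the ordered sequence of identifiers, which this sketch does not preserve.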
Of 8952 encounters, 542 (6.1%) had a severe chest injury. CNN and EN had the highest discrimination, with an area under the receiver operating characteristic curve of 0.93 and calibration slopes between 0.88 and 0.97. CNN had better performance across risk thresholds with fewer discordant cases. Examination of global explanations demonstrated the CNN model had better face validity, with top features including "contusion of lung" and "hemopneumothorax."
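For readers less familiar with the discrimination metric reported above, the following minimal sketch computes the area under the receiver operating characteristic curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half. The labels and scores are made-up toy values, not study data, and the function name `auroc` is illustrative. The last line only rechecks the reported prevalence from the counts given in the abstract.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Toy values for illustration only.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
toy_auc = auroc(toy_labels, toy_scores)

# Prevalence recomputed from the counts in the abstract: 542 of 8952 encounters.
prevalence_percent = round(100 * 542 / 8952, 1)
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example; a rank-based implementation would be preferable at the scale of thousands of encounters.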
The CNN model showed the best discrimination and calibration and selected clinically relevant features.
NLP and ML methods to populate trauma registries for quality analyses are feasible.