Predicting Postoperative Mortality With Deep Neural Networks and Natural Language Processing: Model Development and Validation.
Authors
Chen Pei-Fu, Chen Lichin, Lin Yow-Kuan, Li Guo-Hung, Lai Feipei, Lu Cheng-Wei, Yang Chi-Yu, Chen Kuan-Chih, Lin Tzu-Yu
Affiliations
Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan.
Department of Anesthesiology, Far Eastern Memorial Hospital, New Taipei City, Taiwan.
Publication information
JMIR Med Inform. 2022 May 10;10(5):e38241. doi: 10.2196/38241.
BACKGROUND
Machine learning (ML) achieves better predictions of postoperative mortality than previous prediction tools. Free-text descriptions of the preoperative diagnosis and the planned procedure are available preoperatively. Because reading these descriptions helps anesthesiologists evaluate the risk of the surgery, we hypothesized that deep learning (DL) models with unstructured text could improve postoperative mortality prediction. However, it is challenging to extract meaningful concept embeddings from this unstructured clinical text.
OBJECTIVE
This study aims to develop a fusion DL model that combines structured and unstructured features to predict in-hospital 30-day postoperative mortality before surgery. We assessed ML models that predict postoperative mortality from preoperative data, with and without free-text clinical descriptions.
METHODS
We retrospectively collected preoperative anesthesia assessments, surgical information, and discharge summaries of patients undergoing general or neuraxial anesthesia from electronic health records (EHRs) from 2016 to 2020. We first compared a deep neural network (DNN) with other models using the same input features to establish its effectiveness. We then combined the DNN with bidirectional encoder representations from transformers (BERT) to extract information from clinical text. The effect of adding text information on model performance was assessed using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). P<.05 was considered statistically significant.
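As an illustration of the two evaluation metrics named above, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the toy labels and scores, and the percentile bootstrap used for the 95% CI are all assumptions, since the abstract does not state how the intervals were computed. The AUPRC is approximated by average precision, and tied scores are not given special treatment in that function.

```python
import random

def auroc(y_true, y_score):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    n_pos = sum(y_true)
    true_pos, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank  # precision at each recalled positive
    return total / n_pos

def bootstrap_ci(metric, y_true, y_score, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if 0 < sum(yt) < n:  # a resample must contain both classes
            stats.append(metric(yt, [y_score[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_score = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.50, 0.70]

roc = auroc(y_true, y_score)
ap = average_precision(y_true, y_score)
ci_low, ci_high = bootstrap_ci(auroc, y_true, y_score)
print(f"AUROC={roc:.3f}  AUPRC={ap:.3f}  AUROC 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

The pairwise AUROC loop is quadratic in cohort size, so a cohort of this study's scale would call for a rank-based or library implementation instead.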
RESULTS
The final cohort contained 121,313 patients who underwent surgery, of whom 1562 (1.29%) died within 30 days of surgery. Our BERT-DNN model achieved the highest AUROC (0.964, 95% CI 0.961-0.967) and AUPRC (0.336, 95% CI 0.276-0.402). The AUROC of the BERT-DNN was significantly higher than that of logistic regression (AUROC=0.952, 95% CI 0.949-0.955) and the American Society of Anesthesiologists Physical Status (ASAPS; AUROC=0.892, 95% CI 0.887-0.896) but not significantly higher than that of the DNN (AUROC=0.959, 95% CI 0.956-0.962) or the random forest (AUROC=0.961, 95% CI 0.958-0.964). The AUPRC of the BERT-DNN was significantly higher than that of the DNN (AUPRC=0.319, 95% CI 0.260-0.384), the random forest (AUPRC=0.296, 95% CI 0.239-0.360), logistic regression (AUPRC=0.276, 95% CI 0.220-0.339), and the ASAPS (AUPRC=0.149, 95% CI 0.107-0.203).
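To put the AUPRC values in context, note that a classifier that ranks patients at random has an expected AUPRC of roughly the outcome prevalence, whereas its expected AUROC is 0.5. The short calculation below uses only the cohort counts and the BERT-DNN AUPRC reported above. The variable names and the "lift over baseline" framing are illustrative assumptions, not part of the study.

```python
# Cohort counts and BERT-DNN AUPRC as reported in the results above.
deaths = 1562
patients = 121_313
bert_dnn_auprc = 0.336

# Approximate expected AUPRC of a random ranking: the outcome prevalence.
prevalence = deaths / patients

# How many times the reported AUPRC exceeds that chance-level baseline.
lift = bert_dnn_auprc / prevalence

print(f"prevalence (chance-level AUPRC) = {prevalence:.4f}")
print(f"BERT-DNN AUPRC / prevalence = {lift:.1f}x")
```

With mortality this rare, an AUPRC that looks low in absolute terms can still represent a large improvement over chance, which is why the abstract reports AUPRC alongside AUROC.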
CONCLUSIONS
Our BERT-DNN model achieved a significantly higher AUPRC than previously proposed models that use no text, and a significantly higher AUROC than logistic regression and the ASAPS. This technique helps identify higher-risk patients from the surgical description text in EHRs.