Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania.
Department of Clinical Analytics, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania.
JAMA Netw Open. 2023 Jul 3;6(7):e2322285. doi: 10.1001/jamanetworkopen.2023.22285.
IMPORTANCE: Identifying patients at high risk of adverse outcomes prior to surgery may allow for interventions associated with improved postoperative outcomes; however, few tools exist for automated prediction.
OBJECTIVE: To evaluate the accuracy of an automated machine learning model in identifying patients at high risk of adverse outcomes from surgery using only data in the electronic health record.
DESIGN, SETTING, AND PARTICIPANTS: This prognostic study was conducted among 1 477 561 patients undergoing surgery at 20 community and tertiary care hospitals in the University of Pittsburgh Medical Center (UPMC) health network. The study included 3 phases: (1) building and validating a model on a retrospective population, (2) testing model accuracy on a retrospective population, and (3) validating the model prospectively in clinical care. A gradient-boosted decision tree machine learning method was used for developing a preoperative surgical risk prediction tool. The Shapley additive explanations method was used for model interpretability and further validation. Accuracy was compared between the UPMC model and National Surgical Quality Improvement Program (NSQIP) surgical risk calculator for predicting mortality. Data were analyzed from September through December 2021.
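The Shapley additive explanations method attributes a single prediction to its input features. The following is a minimal sketch of the underlying idea, computing exact Shapley values by enumerating feature subsets. It is not the study's model or code: `toy_risk`, its three features, the coefficients, and the baseline values are all hypothetical, and practical tools for tree ensembles use faster algorithms than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a subset are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for tiny illustrative models.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


# Hypothetical toy risk score (NOT the UPMC model): three made-up inputs
# with one interaction term, so the attributions are not simply additive.
def toy_risk(z):
    age, severity, emergency = z
    return 0.01 * age + 0.1 * severity + 0.2 * emergency + 0.05 * severity * emergency


patient = [70, 3, 1]    # hypothetical patient
reference = [50, 2, 0]  # hypothetical baseline patient

phi = shapley_values(toy_risk, patient, reference)
gap = toy_risk(patient) - toy_risk(reference)

for name, contribution in zip(["age", "severity", "emergency"], phi):
    print(f"{name}: {contribution:+.3f}")
print(f"sum of contributions: {sum(phi):.3f}, prediction gap: {gap:.3f}")
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the property that makes per-feature attributions usable for checking whether a model's reasoning is clinically plausible.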
EXPOSURE: Undergoing any type of surgical procedure.
MAIN OUTCOMES AND MEASURES: Postoperative mortality and major adverse cardiac and cerebrovascular events (MACCEs) at 30 days were evaluated.
RESULTS: Among 1 477 561 patients included in model development (806 148 females [54.5%]; mean [SD] age, 56.8 [17.9] years), 1 016 966 patient encounters were used for training and 254 242 separate encounters were used for testing the model. After deployment in clinical use, another 206 353 patients were prospectively evaluated; an additional 902 patients were selected for comparing the accuracy of the UPMC model and NSQIP tool for predicting mortality. The area under the receiver operating characteristic curve (AUROC) for mortality was 0.972 (95% CI, 0.971-0.973) for the training set and 0.946 (95% CI, 0.943-0.948) for the test set. The AUROC for MACCE and mortality was 0.923 (95% CI, 0.922-0.924) on the training set and 0.899 (95% CI, 0.896-0.902) on the test set. In prospective evaluation, the AUROC for mortality was 0.956 (95% CI, 0.953-0.959), sensitivity was 2148 of 2517 patients (85.3%), specificity was 186 286 of 203 836 patients (91.4%), and negative predictive value was 186 286 of 186 655 patients (99.8%). The model outperformed the NSQIP tool as measured by AUROC (0.945 [95% CI, 0.914-0.977] vs 0.897 [95% CI, 0.854-0.941], for a difference of 0.048), specificity (0.87 [95% CI, 0.83-0.89] vs 0.68 [95% CI, 0.65-0.69]), and accuracy (0.85 [95% CI, 0.82-0.87] vs 0.69 [95% CI, 0.66-0.72]).
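The prospective sensitivity, specificity, and negative predictive value follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable names are illustrative, and the false-negative count is derived from the reported figures rather than stated in the abstract.

```python
# Counts reported for the prospective evaluation (mortality outcome).
true_pos = 2148        # patients who died and were flagged as high risk
actual_pos = 2517      # all patients who died
true_neg = 186_286     # patients who survived and were not flagged
actual_neg = 203_836   # all patients who survived

# Derived count (not stated directly in the abstract).
false_neg = actual_pos - true_pos

sensitivity = true_pos / actual_pos
specificity = true_neg / actual_neg
npv = true_neg / (true_neg + false_neg)  # negative predictive value
total_patients = actual_pos + actual_neg

print(f"patients evaluated: {total_patients}")
print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"negative predictive value: {npv:.1%}")
```

The totals are internally consistent: deaths plus survivors equal the 206 353 prospectively evaluated patients, and true negatives plus derived false negatives equal the 186 655 patients in the negative predictive value denominator.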
CONCLUSIONS AND RELEVANCE: This study found that an automated machine learning model was accurate in identifying patients undergoing surgery who were at high risk of adverse outcomes using only preoperative variables within the electronic health record, with superior performance compared with the NSQIP calculator. These findings suggest that using this model to identify patients at increased risk of adverse outcomes prior to surgery may allow for individualized perioperative care, which may be associated with improved outcomes.