American College of Surgeons NSQIP Risk Calculator Accuracy Using a Machine Learning Algorithm Compared with Regression.

Authors

Liu Yaoming, Ko Clifford Y, Hall Bruce L, Cohen Mark E

Affiliations

From the Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, IL (Liu, Ko, Hall, Cohen).

Department of Surgery, University of California Los Angeles David Geffen School of Medicine and the VA Greater Los Angeles Healthcare System, Los Angeles, CA (Ko).

Publication information

J Am Coll Surg. 2023 May 1;236(5):1024-1030. doi: 10.1097/XCS.0000000000000556. Epub 2023 Jan 12.

Abstract

BACKGROUND

The American College of Surgeons NSQIP risk calculator (RC) uses regression to make predictions for 14 30-day surgical outcomes. While this approach provides risk estimates that are accurate in both discrimination and calibration, machine learning (ML) might improve them. To investigate this possibility, the accuracy of regression-based risk estimates was compared with that of estimates from an extreme gradient boosting (XGB)-ML algorithm.

STUDY DESIGN

A cohort of 5,020,713 NSQIP patient records was randomly divided into 80% for model construction and 20% for validation. Risk predictions using regression and XGB-ML were made for 13 RC binary 30-day surgical complications and one continuous outcome (length of stay [LOS]). For the binary outcomes, discrimination was evaluated using the area under the receiver operating characteristic curve (AUROC) and area under the precision recall curve (AUPRC), and calibration was evaluated using Hosmer-Lemeshow statistics. Mean squared error and a calibration curve analog were evaluated for the continuous LOS outcome.
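To make the two accuracy notions concrete, here is a minimal Python sketch of how AUROC (discrimination) and a Hosmer-Lemeshow statistic (calibration) can be computed for one binary outcome. It is not the authors' code: the function names `auroc` and `hosmer_lemeshow`, the default of ten risk-ordered groups, and the toy data are assumptions made for illustration only.

```python
def auroc(y_true, y_score):
    """AUROC via the rank-sum (Mann-Whitney) identity; ties get average ranks."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Sum over risk-ordered groups of (observed - expected)^2 / (n * p * (1 - p))."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # events seen in this group
        expected = sum(p for p, _ in chunk)   # events the model predicted
        mean_p = expected / size
        denom = size * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy data (hypothetical): four patients with predicted risks and outcomes.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A larger Hosmer-Lemeshow value means observed event counts diverge more from predicted counts within the risk groups, which is the sense in which the abstract speaks of miscalibration.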

RESULTS

For every binary outcome, discrimination (AUROC and AUPRC) was slightly greater for XGB-ML than for regression (mean [across the outcomes] AUROC was 0.8299 vs 0.8251, and mean AUPRC was 0.1558 vs 0.1476, for XGB-ML and regression, respectively). For each outcome, miscalibration was greater (larger Hosmer-Lemeshow values) with regression; there was statistically significant miscalibration for all regression-based estimates, but only for 4 of 13 when XGB-ML was used. For LOS, mean squared error was lower for XGB-ML.
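Whether a Hosmer-Lemeshow value counts as statistically significant miscalibration is conventionally judged against a chi-square distribution. The sketch below converts a statistic to a p-value using the closed-form upper tail that holds for even degrees of freedom. The abstract does not report the number of groups or the degrees of freedom used, so the value of 8 (ten groups minus two) in the usage line is an assumption, as is the function name.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with positive even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hypothetical statistic of 20.0 with an assumed df of 8.
p_value = chi2_upper_tail_even_df(20.0, 8)
print(p_value < 0.05)
```

A p-value below the chosen threshold would flag the corresponding set of risk estimates as significantly miscalibrated.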

CONCLUSIONS

XGB-ML provided more accurate risk estimates than regression in terms of discrimination and calibration. Differences in calibration between regression and XGB-ML were of substantial magnitude and support transitioning the RC to XGB-ML.
