A Machine Learning Classifier Improves Mortality Prediction Compared With Pediatric Logistic Organ Dysfunction-2 Score: Model Development and Validation.

Author information

Prince Remi D, Akhondi-Asl Alireza, Mehta Nilesh M, Geva Alon

Affiliations

Tufts University School of Medicine, Boston, MA.

Division of Critical Care Medicine, Department of Anesthesiology, Critical Care, and Pain Medicine, Boston Children's Hospital, Boston, MA.

Publication information

Crit Care Explor. 2021 May 17;3(5):e0426. doi: 10.1097/CCE.0000000000000426. eCollection 2021 May.

DOI:10.1097/CCE.0000000000000426

PMID:34036277

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8133049/

Abstract

OBJECTIVES

To determine whether machine learning algorithms can better predict PICU mortality than the Pediatric Logistic Organ Dysfunction-2 score.

DESIGN

Retrospective study.

SETTING

Quaternary care medical-surgical PICU.

PATIENTS

All patients admitted to the PICU from 2013 to 2019.

INTERVENTIONS

None.

MEASUREMENTS AND MAIN RESULTS

We investigated the performance of various machine learning algorithms using the same variables used to calculate the Pediatric Logistic Organ Dysfunction-2 score to predict PICU mortality. We used 10,194 patient records from 2013 to 2017 for training and 4,043 patient records from 2018 to 2019 as a holdout validation cohort. Mortality rate was 3.0% in the training cohort and 3.4% in the validation cohort. The best performing algorithm was a random forest model (area under the receiver operating characteristic curve, 0.867 [95% CI, 0.863-0.895]; area under the precision-recall curve, 0.327 [95% CI, 0.246-0.414]; F1, 0.396 [95% CI, 0.321-0.468]) and significantly outperformed the Pediatric Logistic Organ Dysfunction-2 score (area under the receiver operating characteristic curve, 0.761 [95% CI, 0.713-0.810]; area under the precision-recall curve, 0.239 [95% CI, 0.165-0.316]; F1, 0.284 [95% CI, 0.209-0.360]), although this difference was reduced after retraining the Pediatric Logistic Organ Dysfunction-2 logistic regression model at the study institution. The random forest model also showed better calibration than the Pediatric Logistic Organ Dysfunction-2 score, and calibration of the random forest model remained superior to the retrained Pediatric Logistic Organ Dysfunction-2 model.
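The discrimination metrics reported above (area under the receiver operating characteristic curve and F1, each with a 95% CI) can be illustrated with a minimal Python sketch. The abstract does not state how the intervals or the F1 decision threshold were obtained, so the percentile bootstrap and the 0.5 threshold below are assumptions, and the ten-patient `y_true` and `y_score` lists are made-up toy data rather than study results.

```python
import random


def auroc(y_true, y_score):
    """AUROC as the probability that a randomly chosen death is scored
    higher than a randomly chosen survivor (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_score, threshold=0.5):
    """F1 = 2*TP / (2*TP + FP + FN) after thresholding predicted risk.
    The 0.5 threshold is an assumption for illustration."""
    tp = fp = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(metric, y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample patients with replacement and
    take the alpha/2 and 1 - alpha/2 quantiles of the metric."""
    if len(set(y_true)) < 2:
        raise ValueError("both outcome classes are required")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # skip resamples that contain only one class
        ys = [y_score[i] for i in idx]
        stats.append(metric(yt, ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy data: 1 = died in the PICU, 0 = survived; scores are predicted risks.
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1]
y_score = [0.02, 0.10, 0.05, 0.30, 0.65, 0.08, 0.25, 0.01, 0.12, 0.80]

print("AUROC:", auroc(y_true, y_score))
print("F1:", f1_score(y_true, y_score))
print("AUROC 95% CI:", bootstrap_ci(auroc, y_true, y_score, n_boot=200))
```

With a mortality rate near 3%, as in this cohort, the precision-recall curve and F1 are sensitive to how the rare positive class is handled, which is presumably why the abstract reports them alongside the area under the receiver operating characteristic curve.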

CONCLUSIONS

A machine learning model achieved better performance than a logistic regression-based score for predicting ICU mortality. Better estimation of mortality risk can improve our ability to adjust for severity of illness in future studies, although external validation is required before this method can be widely deployed.
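Calibration, which the results compare between the random forest and the Pediatric Logistic Organ Dysfunction-2 models, describes how closely predicted mortality risk matches observed mortality. The abstract does not say which calibration measure was used; the Brier score and the binned observed-versus-predicted table below are common choices and are shown only as an assumed illustration on made-up toy data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_table(y_true, y_prob, n_bins=5):
    """Group patients into equal-width risk bins and return, for each
    non-empty bin, (mean predicted risk, observed mortality, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[i].append((y, p))
    rows = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            rows.append((mean_pred, observed, len(b)))
    return rows


# Toy data: 1 = died, 0 = survived; probabilities are predicted risks.
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
risks = [0.1, 0.1, 0.3, 0.3, 0.7, 0.7, 0.9, 0.9]

print("Brier score:", brier_score(outcomes, risks))
for mean_pred, observed, count in calibration_table(outcomes, risks):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

A well-calibrated model has observed mortality close to mean predicted risk in every bin, which is the property that matters when predicted risk is used to adjust for severity of illness.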
