Department of Radiology, Children's Hospital of Soochow University, Suzhou, 215025, China.
BMC Pediatr. 2024 Nov 21;24(1):760. doi: 10.1186/s12887-024-05249-1.
Pulmonary hemorrhage (PH) complicating respiratory distress syndrome (RDS) in extremely preterm infants is associated with high mortality and poor long-term outcomes. The aim of the present study was to develop a machine learning (ML) model to predict PH in extremely preterm infants with RDS.
We performed a retrospective analysis of extremely preterm infants with RDS treated at the Children's Hospital of Soochow University between January 2015 and January 2021. We applied three ML algorithms: logistic regression (LR), random forest (RF), and extreme gradient boosting (XGBoost). We evaluated the performance of each model using the area under the curve (AUC) and developed a predictive model based on the best-performing algorithm. We calculated SHapley Additive exPlanations (SHAP) values to determine variable importance and visualize the results, and constructed a nomogram for individualized prediction.
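The model comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic (generated to match only the cohort size, variable count, and approximate PH prevalence), the 70/30 stratified hold-out split and all hyperparameters are assumptions not reported in the abstract, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example depends on a single library.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cohort: 309 rows, 29 variables, ~15.5% positives.
# These are NOT the study data.
X, y = make_classification(
    n_samples=309, n_features=29, n_informative=6,
    weights=[0.845, 0.155], random_state=0,
)

# Assumed evaluation scheme: one stratified 70/30 hold-out split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "LR": LogisticRegression(max_iter=5000),
    "RF": RandomForestClassifier(n_estimators=300, random_state=0),
    # Stand-in for XGBoost (a different gradient boosting implementation).
    "GB": GradientBoostingClassifier(random_state=0),
}

# Fit each model and score it by AUC on the held-out rows.
aucs = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]  # predicted probability of the positive class
    aucs[name] = roc_auc_score(y_test, prob)

# The model with the highest AUC is taken forward.
best = max(aucs, key=aucs.get)
print(aucs, "best:", best)
```

With a cohort of this size and roughly 48 positive cases, a single hold-out split gives a noisy AUC estimate; repeated or cross-validated splits would be a reasonable alternative under the same assumptions.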
A total of 309 patients with RDS were enrolled, of whom 48 (15.5%) had PH. A total of 29 variables were collected, covering demographic and clinical characteristics, laboratory data, and imaging classification. According to the AUC values, the RF model performed best (AUC = 0.868). Based on the SHAP values, the five most important variables in the RF model were gestational age, PaO2/FiO2, birth weight, mean platelet volume, and Apgar score at 5 min.
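To make the SHAP-based ranking concrete, the sketch below computes exact Shapley values for a toy model in plain Python. It is illustrative only: `shapley_values` and `toy_model` are hypothetical names, the coefficients are arbitrary, and "absent" variables are replaced by a fixed baseline value, which is one common convention. The study presumably used a SHAP implementation suited to tree models rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Variables outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this is practical only for a few variables.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the variables in `coalition` take their real values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude variable i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical linear score with arbitrary coefficients (not from the study).
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[2]


x = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

# Rank variables by the magnitude of their contribution for this one instance.
ranking = sorted(range(len(phi)), key=lambda i: -abs(phi[i]))
print(phi, ranking)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline. A global ranking such as the top-five list above is typically obtained by averaging the absolute contribution of each variable over all patients.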
Our study showed that the RF model could be used to predict the risk of PH in extremely preterm infants with RDS. The nomogram provides clinicians with an effective tool for early warning and timely management.