Department of Gastroenterology and Hepatology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
School of the First Clinical Medical Sciences, Wenzhou Medical University, Wenzhou, China.
Front Cell Infect Microbiol. 2022 Apr 12;12:819267. doi: 10.3389/fcimb.2022.819267. eCollection 2022.
The aim of this study was to apply machine learning models and a nomogram to differentiate critically ill from non-critically ill COVID-19 pneumonia patients.
Clinical symptoms and signs, laboratory parameters, cytokine profiles, and immune cell data of 63 COVID-19 pneumonia patients were retrospectively reviewed. Outcomes were followed up until March 12, 2020. Logistic regression (LR), Random Forest, and XGBoost models were developed. Model performance was measured by the area under the receiver operating characteristic curve (AUC).
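As a rough illustration of the AUC metric named above, the following minimal sketch computes it as the probability that a randomly chosen critically ill patient receives a higher score than a randomly chosen non-critically ill patient, with ties counted as half. The function name `auc` and the toy labels and scores are assumptions made for illustration; they are not the study's data or code.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 = critically ill, 0 = non-critically ill.
    scores: model output or a single marker value; higher means more likely critical.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case correctly ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```

The same function applies to a single biomarker used as a score, which is how a per-variable AUC could be obtained.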
Univariate analysis revealed that critically ill and non-critically ill patients differed in levels of interleukin-6, interleukin-10, T cells, CD4 T cells, and CD8 T cells. Interleukin-10, with an AUC of 0.86, was the most useful single predictor of critical illness in patients with COVID-19 pneumonia. Ten variables (respiratory rate, neutrophil count, aspartate transaminase, albumin, serum procalcitonin, D-dimer, B-type natriuretic peptide, CD4 T cells, interleukin-6, and interleukin-10) were used as candidate predictors for the LR, Random Forest (RF), and XGBoost models. The coefficients from the LR model were used to build a nomogram. The RF and XGBoost methods suggested that interleukin-10 and interleukin-6 were the most important variables for predicting severity of illness. The mean AUCs for the LR, RF, and XGBoost models were 0.91, 0.89, and 0.93, respectively (in two-fold cross-validation). Individualized predictions by the XGBoost model were explained with local interpretable model-agnostic explanations (LIME) plots.
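To make the step from LR coefficients to a nomogram concrete, here is a minimal sketch of one common construction: each predictor's contribution to the linear predictor is rescaled so that the predictor with the widest effect range spans 0 to 100 points, and the total points map back to a predicted probability through the logistic function. The intercept, coefficients, value ranges, and patient values below are hypothetical placeholders, not the fitted values from this study, and only two of the ten predictors are shown.

```python
import math

# Hypothetical intercept and (coefficient, min, max) per predictor.
# These are illustrative assumptions, NOT the study's fitted model.
INTERCEPT = -4.0
PREDICTORS = {
    "interleukin_10": (0.30, 0.0, 20.0),
    "respiratory_rate": (0.10, 12.0, 40.0),
}


def point_scale(predictors):
    """Points per unit of linear predictor; the widest effect range gets 100 points."""
    max_span = max(abs(b) * (hi - lo) for b, lo, hi in predictors.values())
    return 100.0 / max_span


def reference_value(b, lo, hi):
    """End of the range that contributes least to risk; it scores zero points."""
    return lo if b >= 0 else hi


def points(name, value, predictors, scale):
    """Nomogram points for one predictor value."""
    b, lo, hi = predictors[name]
    return b * (value - reference_value(b, lo, hi)) * scale


def probability_from_points(total_points, predictors, intercept, scale):
    """Convert total points back to the linear predictor, then to a probability."""
    baseline = intercept + sum(
        b * reference_value(b, lo, hi) for b, lo, hi in predictors.values()
    )
    linear_predictor = baseline + total_points / scale
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical patient values.
patient = {"interleukin_10": 10.0, "respiratory_rate": 26.0}
scale = point_scale(PREDICTORS)
total = sum(points(k, v, PREDICTORS, scale) for k, v in patient.items())
risk = probability_from_points(total, PREDICTORS, INTERCEPT, scale)
print(round(total, 1), round(risk, 3))
```

Because the points are a linear rescaling of the model's linear predictor, reading the total off the nomogram gives the same probability as evaluating the logistic regression directly.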
XGBoost exhibited the highest discriminatory performance for predicting critical illness in patients with COVID-19 pneumonia. The nomogram and the visual interpretation provided by LIME plots could be useful in the clinical setting. Additionally, interleukin-10 could serve as a useful predictor of critical illness in patients with COVID-19 pneumonia.