Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China.
Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China.
Viruses. 2024 Oct 17;16(10):1624. doi: 10.3390/v16101624.
In the clinical diagnosis of pneumonia, particularly during the COVID-19 pandemic, patients who progress to a critical stage requiring mechanical ventilation are classified as mechanically ventilated critically ill patients. Accurately predicting discharge outcomes for this cohort, especially for those with COVID-19, is of paramount clinical importance. Missing data, a common issue in medical research, can substantially affect the validity of analyses. In this work, we address this challenge with two imputation techniques, multiple imputation and missForest (a random forest-based method), to improve data completeness. We then use smoothly clipped absolute deviation (SCAD) penalized logistic regression to select significant features. Our real data analysis compares the predictive performance of extreme learning machines, random forests, support vector machines, and XGBoost using 10-fold cross-validation. XGBoost consistently outperforms the other methods in predicting discharge outcomes, making it a reliable tool for clinical decision-making in the treatment of severe pneumonia, including COVID-19 cases. In this setting, missForest imputation generally yields better predictive performance than multiple imputation, underscoring its effectiveness in managing missing data.
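For readers unfamiliar with the SCAD penalty used in the feature selection step, the following minimal sketch evaluates the standard SCAD penalty of Fan and Li for a single coefficient. It is an illustration only, not the authors' implementation: the function name `scad_penalty` is hypothetical, and the default `a = 3.7` is the value commonly suggested in the literature, since the abstract does not state the tuning parameters used in the study.

```python
def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """Return the SCAD penalty for one coefficient.

    beta : coefficient value
    lam  : regularization parameter lambda (assumed >= 0)
    a    : shape parameter (assumed > 2; 3.7 is a common default,
           not a value taken from the abstract)
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    b = abs(beta)
    if b <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * b
    if b <= a * lam:
        # Intermediate coefficients: quadratic piece that tapers the penalty.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for value in (0.5, 2.0, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

Because the penalty is flat for large coefficients, SCAD shrinks small effects toward zero like the lasso while leaving strong predictors nearly unbiased, which is the usual motivation for using it in variable selection.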