Ehwerhemuepha Louis, Danioko Sidy, Verma Shiva, Marano Rachel, Feaster William, Taraman Sharief, Moreno Tatiana, Zheng Jianwei, Yaghmaei Ehsan, Chang Anthony
Children's Hospital of Orange County, Orange, CA, 92868, United States.
Schmid College of Science, Chapman University, Orange, CA, 92866, United States.
Intell Based Med. 2021;5:100030. doi: 10.1016/j.ibmed.2021.100030. Epub 2021 Mar 17.
Cardiovascular and other circulatory system diseases have been implicated in the severity of COVID-19 in adults. This study develops a super learner ensemble of models for predicting COVID-19 severity among adult patients with these conditions.
The COVID-19 Dataset of the Cerner Real-World Data was used for this study. Data on adult patients (18 years or older) with cardiovascular diseases between 2017 and 2019 were retrieved, and a total of 13 of these conditions were identified. Among these patients, 33,042 who were admitted with a positive diagnosis of COVID-19 between March 2020 and June 2020 (from 59 hospitals) were selected for this study. A total of 14 statistical and machine learning models were developed and combined into a more powerful super learner model for predicting COVID-19 severity on admission to the hospital.
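The idea of combining base models into a super learner can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the three toy base learners, the synthetic data, the Brier-score loss, and the grid search over convex weights are all hypothetical stand-ins, since the abstract does not describe how the 14 models were weighted. Each base learner produces out-of-fold predictions, and the ensemble weights are chosen to minimize the cross-validated loss of the weighted combination.

```python
import random

def fit_mean(X, y):
    """Toy base learner: always predicts the training event rate."""
    p = sum(y) / len(y)
    return lambda x: p

def make_stump(j):
    """Toy base learner factory: split feature j at its training mean
    and predict the event rate observed on each side of the split."""
    def fit(X, y):
        cut = sum(row[j] for row in X) / len(X)
        lo = [yi for row, yi in zip(X, y) if row[j] <= cut]
        hi = [yi for row, yi in zip(X, y) if row[j] > cut]
        overall = sum(y) / len(y)
        p_lo = sum(lo) / len(lo) if lo else overall
        p_hi = sum(hi) / len(hi) if hi else overall
        return lambda x: p_hi if x[j] > cut else p_lo
    return fit

def out_of_fold(X, y, learners, k=5):
    """Return an n-by-L matrix of predictions, each made by a model
    that never saw the corresponding row during fitting."""
    n = len(X)
    Z = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        for m, fit in enumerate(learners):
            predict = fit(X_tr, y_tr)
            for i in test:
                Z[i][m] = predict(X[i])
    return Z

def brier(preds, y):
    """Mean squared error between predicted probabilities and outcomes."""
    return sum((p - yi) ** 2 for p, yi in zip(preds, y)) / len(y)

def combine(row, w):
    """Weighted average of one row of base-learner predictions."""
    return sum(wi * zi for wi, zi in zip(w, row))

def fit_super_learner(X, y, learners, k=5, steps=10):
    """Pick convex weights minimizing cross-validated Brier score.
    The grid search below is written for exactly three learners."""
    assert len(learners) == 3
    Z = out_of_fold(X, y, learners, k)
    best_w, best_loss = None, float("inf")
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            w = (a / steps, b / steps, (steps - a - b) / steps)
            loss = brier([combine(r, w) for r in Z], y)
            if loss < best_loss:
                best_w, best_loss = w, loss
    fitted = [fit(X, y) for fit in learners]  # refit on all data
    predict = lambda x: combine([f(x) for f in fitted], best_w)
    return predict, best_w, best_loss, Z

# Synthetic data (hypothetical; not related to the Cerner dataset).
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if 0.6 * x[0] + 0.4 * x[1] + rng.gauss(0, 0.2) > 0.5 else 0 for x in X]

learners = [fit_mean, make_stump(0), make_stump(1)]
predict, weights, cv_loss, Z = fit_super_learner(X, y, learners)
print("weights:", weights, "cross-validated Brier score:", cv_loss)
```

Because the weight grid includes the corners of the simplex (all weight on a single learner), the selected combination can never have a worse cross-validated loss than the best individual base learner on the same folds. In practice, a library implementation with a constrained optimizer would likely replace the grid search.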
LASSO regression, a full extreme gradient boosting model with a tree depth of 2, and a full logistic regression model were the most predictive, with cross-validated AUROCs of 0.7964, 0.7961, and 0.7958, respectively. The resulting super learner ensemble model had a cross-validated AUROC of 0.8006 (range: 0.7814, 0.8163). The unbiased AUROC of the super learner model on an independent test set was 0.8057 (95% CI: 0.7954, 0.8159).
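For readers unfamiliar with the metric, the sketch below shows one way an AUROC and a 95% confidence interval could be computed on a held-out test set. It is an illustration only: the abstract does not state how the interval was obtained, so the percentile bootstrap used here is an assumption, and the labels and scores are synthetic. The AUROC is computed directly from its definition as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
import random

def auroc(y_true, scores):
    """AUROC from its pairwise definition (ties count as 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic test-set labels and scores (hypothetical values).
rng = random.Random(1)
y_test = [i % 2 for i in range(120)]
s_test = [0.3 * t + rng.random() for t in y_test]

estimate = auroc(y_test, s_test)
ci_low, ci_high = bootstrap_ci(y_test, s_test, n_boot=300)
print(f"AUROC = {estimate:.4f} (95% CI: {ci_low:.4f}, {ci_high:.4f})")
```

The pairwise computation is quadratic in the sample size, so for a test set on the scale of this study a rank-based formula or a library routine would be the more practical choice.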
Highly predictive models can be built to predict COVID-19 severity in patients with cardiovascular and other circulatory conditions. Super learner ensembles will significantly improve on individual and classical ensemble models.