Hou Jintong, Haslund-Gourley Benjamin, Diray-Arce Joann, Hoch Annmarie, Rouphael Nadine, Becker Patrice M, Augustine Alison D, Ozonoff Al, Guan Leying, Kleinstein Steven H, Peters Bjoern, Reed Elaine, Altman Matt, Langelier Charles R, Maecker Holden, Kim Seunghee, Montgomery Ruth R, Krammer Florian, Wilson Michael, Eckalbar Walter, Bosinger Steven E, Levy Ofer, Steen Hanno, Rosen Lindsey B, Baden Lindsey R, Melamed Esther, Ehrlich Lauren I R, McComsey Grace A, Sekaly Rafick P, Schaenman Joanna, Shaw Albert C, Hafler David A, Corry David B, Kheradmand Farrah, Atkinson Mark A, Brakenridge Scott C, Agudelo Higuita Nelson I, Metcalf Jordan P, Hough Catherine L, Messer William B, Pulendran Bali, Nadeau Kari C, Davis Mark M, Fernandez Sesma Ana, Simon Viviana, Kraft Monica, Bime Chris, Calfee Carolyn S, Erle David J, Robinson Lucy F, Cairns Charles B, Haddad Elias K, Comunale Mary Ann
Department of Microbiology and Immunology/Department of Medicine/Department of Epidemiology & Biostatistics, Drexel University, Philadelphia, PA, United States.
Clinical and Data Coordinating Center (CDCC) Precision Vaccines Program, Boston Children's Hospital, Boston, MA, United States.
Front Med (Lausanne). 2025 Jul 4;12:1604388. doi: 10.3389/fmed.2025.1604388. eCollection 2025.
The coronavirus disease 2019 (COVID-19) pandemic threatened public health and placed a significant burden on medical resources. The Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study collected clinical, demographic, blood cytometry, serum receptor-binding domain (RBD) antibody titers, metabolomics, targeted proteomics, nasal metagenomics, Olink, nasal viral load, autoantibody, SARS-CoV-2 antibody titers, and nasal and peripheral blood mononuclear cell (PBMC) transcriptomics data from patients hospitalized with COVID-19. The aim of this study was to select baseline biomarkers and build predictive models for 28-day in-hospital COVID-19 severity and mortality using the most predictive variables, while prioritizing routinely collected ones.
We analyzed 1,102 hospitalized COVID-19 participants. We used the lasso and forward selection to select the top predictors of severity and mortality, built predictive models on balanced training data, and then validated the models on testing data.
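As a rough illustration of the forward selection step named above, the following minimal Python sketch adds one variable at a time, keeping the candidate that most improves a scoring function and stopping when the gain falls below a threshold. The function name `forward_select`, the `min_gain` parameter, the variable names, and the toy scoring weights are all hypothetical and are not taken from the study; in practice the score would be something like a cross-validated AUC on the training data.

```python
from typing import Callable, List, Sequence


def forward_select(candidates: Sequence[str],
                   score: Callable[[List[str]], float],
                   min_gain: float = 0.0) -> List[str]:
    """Greedy forward selection: repeatedly add the variable that raises
    the score the most; stop when the best gain is <= min_gain."""
    selected: List[str] = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score every model that extends the current one by one variable.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_var = max(trials, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected


# Toy scoring function with made-up additive weights (NOT study results):
# it only stands in for a real model-evaluation step.
TOY_WEIGHTS = {"spo2_fio2": 0.30, "age": 0.05, "bmi": 0.02, "noise": 0.0}


def toy_score(variables: List[str]) -> float:
    return 0.5 + sum(TOY_WEIGHTS[v] for v in variables)


chosen = forward_select(list(TOY_WEIGHTS), toy_score, min_gain=0.01)
print(chosen)
```

With these toy weights the procedure keeps the three informative variables and drops the uninformative one, because adding it does not clear the `min_gain` threshold.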
Severity was best predicted by the baseline SpO₂/FiO₂ ratio (test AUC: 0.874). Adding patient age, BMI, FGF23, IL-6, and LTA to the disease severity prediction model improved the test AUC by an additional 3%. The clinical mortality prediction model using the SpO₂/FiO₂ ratio, age, and BMI achieved a test AUC of 0.83. Adding laboratory results such as TNFRSF11B and plasma ribitol count improved the mortality prediction model by 3.5%. The severity and mortality prediction models outperformed the Sequential Organ Failure Assessment (SOFA) score among inpatients and performed similarly to the SOFA score among ICU patients.
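To make the reported metric concrete, here is a minimal Python sketch of how a single-predictor AUC could be computed: the ratio is taken as SpO₂ (in percent) divided by FiO₂ (as a fraction), and the AUC is the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case. The helper names `sf_ratio` and `roc_auc` and the six patient records are illustrative assumptions, not data or code from the study.

```python
from typing import List, Sequence


def sf_ratio(spo2_percent: float, fio2_fraction: float) -> float:
    """SpO2/FiO2 ratio, with SpO2 in percent and FiO2 as a fraction (0.21-1.0)."""
    return spo2_percent / fio2_fraction


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise (Mann-Whitney) AUC: fraction of case/non-case pairs in which
    the case has the higher score, counting ties as half."""
    pos: List[float] = [s for s, y in zip(scores, labels) if y == 1]
    neg: List[float] = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example records: (SpO2 %, FiO2 fraction, severe outcome 0/1).
patients = [
    (98, 0.21, 0),
    (95, 0.21, 0),
    (94, 0.28, 0),
    (92, 0.40, 1),
    (90, 0.60, 1),
    (96, 0.21, 1),
]

# A lower ratio indicates worse oxygenation, so negate it to get a risk score.
risk = [-sf_ratio(spo2, fio2) for spo2, fio2, _ in patients]
outcome = [y for _, _, y in patients]
auc = roc_auc(risk, outcome)
print(round(auc, 3))
```

In this toy set, seven of the nine case/non-case pairs are ranked correctly, so the AUC is 7/9.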
This study identifies clinical data and laboratory biomarkers of COVID-19 severity and mortality using machine learning models. The SpO₂/FiO₂ ratio was the most important predictor of both severity and mortality, and several biomarkers modestly improved the predictions. The results also characterize SARS-CoV-2 infection during the early stages of the coronavirus emergence and can serve as a baseline for future studies of how the genetic evolution of the coronavirus affects the host response to new variants.