Tulu Thomas Wetere, Wan Tsz Kin, Chan Ching Long, Wu Chun Hei, Woo Peter Yat Ming, Tseng Cee Zhung Steven, Vodencarevic Asmir, Menni Cristina, Chan Kei Hang Katie
Department of Biomedical Sciences, City University of Hong Kong, Hong Kong SAR, China.
Computational Data Science Program, Addis Ababa University, Addis Ababa, Ethiopia.
BMC Digit Health. 2023;1(1):6. doi: 10.1186/s44247-022-00001-0. Epub 2023 Feb 3.
COVID-19 has become a major global public health problem despite prevention efforts. The daily number of COVID-19 cases is increasing rapidly, and the time and financial costs associated with testing procedures are burdensome. To address this, we aimed to identify immunological and metabolic biomarkers that predict COVID-19 mortality using a machine learning model. We included inpatients from Hong Kong's public hospitals between January 1 and September 30, 2020, who were diagnosed with COVID-19 by RT-PCR. We developed three machine learning models to predict the mortality of COVID-19 patients from data in their electronic medical records: Deep Neural Networks (DNN), Random Forest Classifier (RF) and Support Vector Machine (SVM). We compared the trained models statistically using data from a cohort of 5,059 patients (median age = 46 years; 49.3% male) who had tested positive for COVID-19 according to their electronic health records, with data from 532,427 patients as controls. We identified the top 20 immunological and metabolic biomarkers that can accurately predict the risk of mortality from COVID-19, with a ROC-AUC of 0.98 (95% CI 0.96-0.98). Of the three models, our results demonstrate that the RF model achieved the most accurate prediction of mortality among COVID-19 patients, with age, glomerular filtration, albumin, urea, procalcitonin, C-reactive protein, oxygen, bicarbonate, carbon dioxide, ferritin, glucose, erythrocytes, creatinine, lymphocytes, blood pH and leukocytes among the most important biomarkers identified. A cohort from Kwong Wah Hospital (131 patients) was used for model validation, with a ROC-AUC of 0.90 (95% CI 0.84-0.92). We recommend that physicians closely monitor hematological, coagulation, cardiac, hepatic, renal and inflammatory factors for potential progression to severe conditions among COVID-19 patients.
To the best of our knowledge, no previous research has identified important immunological and metabolic biomarkers to the extent demonstrated in our study.
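The model comparison described above can be illustrated with a minimal sketch. It is not the study's pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the electronic medical records, substitutes `MLPClassifier` for the DNN, and estimates the 95% CI with a percentile bootstrap, which the abstract does not specify. The helper `bootstrap_auc_ci` and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Return (auc, lower, upper) using a percentile bootstrap (an assumption)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, n)           # resample rows with replacement
        if len(np.unique(y_true[idx])) < 2:   # AUC needs both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lower, upper = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, y_score), lower, upper


# Synthetic stand-in for the patient data: 30 "biomarkers", about 10% positives.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=8,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Three model families loosely mirroring RF, SVM and DNN (MLP as a stand-in).
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(),
                         SVC(probability=True, random_state=0)),
    "DNN": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32, 16),
                                       max_iter=500, random_state=0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1
    results[name] = bootstrap_auc_ci(y_test, scores)
    auc, lower, upper = results[name]
    print(f"{name}: ROC-AUC {auc:.2f} (95% CI {lower:.2f}-{upper:.2f})")

# Rank features by the random forest's impurity-based importance; keep the top 20.
rf = models["RF"]
top20 = np.argsort(rf.feature_importances_)[::-1][:20]
print("Top 20 feature indices:", top20.tolist())
```

With real data, the feature indices would map to named laboratory variables, and a held-out cohort from a separate hospital would play the role of `X_test` for external validation.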
The online version contains supplementary material available at 10.1186/s44247-022-00001-0.