Department of Public Health, School of Public Health, College of Medicine and Health Science, Mizan-Tepi University, Mizan-Aman, Ethiopia.
Department of Public Health, College of Medicine and Health Science, Debre Markos University, Gojjam, Ethiopia.
BMC Public Health. 2024 Jun 28;24(1):1728. doi: 10.1186/s12889-024-19196-0.
Coronavirus disease 2019 (COVID-19), a global public health crisis, continues to pose challenges despite preventive measures. The daily rise in COVID-19 cases is concerning, and the testing process is both time-consuming and costly. While several models have been created to predict mortality in COVID-19 patients, only a few have shown sufficient accuracy. Machine learning algorithms offer a promising approach to data-driven prediction of clinical outcomes, surpassing traditional statistical modeling. Leveraging machine learning (ML) algorithms could potentially provide a solution for predicting mortality in hospitalized COVID-19 patients in Ethiopia. Therefore, the aim of this study is to develop and validate machine-learning models for accurately predicting mortality in COVID-19 hospitalized patients in Ethiopia.
We analyzed electronic medical records of COVID-19 patients admitted to public hospitals in Ethiopia. We developed seven machine learning models to predict COVID-19 patient mortality: J48 decision tree, random forest (RF), k-nearest neighbors (k-NN), multi-layer perceptron (MLP), Naïve Bayes (NB), eXtreme gradient boosting (XGBoost), and logistic regression (LR). We then compared the performance of these models on data from a cohort of 696 patients. To evaluate the models, we used metrics derived from the confusion matrix, such as sensitivity, specificity, and precision, together with the receiver operating characteristic (ROC) curve.
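The confusion-matrix metrics named above can be illustrated with a minimal sketch. The `ConfusionCounts` class, the `classification_metrics` function, and the example counts are hypothetical and chosen only for illustration; they are not the study's code or data, and "positive" is assumed here to mean death.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class assumed to be 'died')."""
    tp: int  # deaths predicted as deaths
    fp: int  # survivors predicted as deaths
    tn: int  # survivors predicted as survivors
    fn: int  # deaths predicted as survivors


def classification_metrics(c: ConfusionCounts) -> dict:
    """Standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes occur and
    at least one positive prediction was made.
    """
    sensitivity = c.tp / (c.tp + c.fn)   # recall, true positive rate
    specificity = c.tn / (c.tn + c.fp)   # true negative rate
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, not taken from the study.
example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The ROC curve is not a single confusion matrix: it is traced by recomputing sensitivity and 1 - specificity as the decision threshold varies, which is why it is listed separately from the other metrics.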
The study included 696 patients, with more females (440 patients, 63.2%) than males. The median age of the participants was 35.0 years, with an interquartile range of 18-79. After feature selection, 23 features were identified as predictors of mortality; gender, intensive care unit (ICU) admission, and alcohol drinking/addiction were the top three predictors of COVID-19 mortality. In contrast, loss of smell, loss of taste, and hypertension were the three lowest-ranked predictors. The k-nearest neighbors (k-NN) algorithm outperformed the other machine learning algorithms, achieving an accuracy of 95.25%, a sensitivity of 95.30%, a precision of 92.7%, a specificity of 93.30%, an F1 score of 93.98%, and a receiver operating characteristic (ROC) score of 96.90%. These findings highlight the effectiveness of the k-NN algorithm in predicting COVID-19 outcomes based on the selected features.
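As a quick consistency check on the reported k-NN figures, the F1 score can be recomputed as the harmonic mean of precision and sensitivity. This is a minimal sketch; the `f1_score` helper is a hypothetical name, and only the two input values come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


# Values reported for k-NN in the abstract, expressed as fractions.
reported_precision = 0.927
reported_sensitivity = 0.9530

f1 = f1_score(reported_precision, reported_sensitivity)
print(f"{f1:.2%}")  # prints 93.98%
```

The recomputed value agrees with the F1 score of 93.98% stated in the results.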
Our study developed a model that uses hospital data to predict the mortality risk of COVID-19 patients. The main objective of this model is to prioritize early treatment for high-risk patients and to relieve strained healthcare systems during the ongoing pandemic. By combining machine learning with hospital databases, the model classifies patients' mortality risk, enabling targeted medical interventions and improved resource management. Among the methods tested, the k-nearest neighbors (k-NN) algorithm demonstrated the highest accuracy, allowing for early identification of high-risk patients. Using k-NN, we identified 23 predictors that contribute to predicting COVID-19 mortality. The top five predictors are gender (female), intensive care unit (ICU) admission, alcohol drinking, smoking, and symptoms of headache and chills. By prioritizing patients and services according to these predictors, healthcare facilities and providers may improve patients' chances of survival. The model provides insights that can guide healthcare professionals in allocating resources and delivering appropriate care to those at highest risk.