Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran.
Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran.
BMC Med Inform Decis Mak. 2022 Jan 4;22(1):2. doi: 10.1186/s12911-021-01742-0.
Patients hospitalized with coronavirus disease (COVID-19) are at risk of death. Machine learning (ML) algorithms are a potential solution for predicting mortality in these patients. This study therefore aimed to compare several ML algorithms for predicting COVID-19 mortality using patient data collected at the time of first admission, and to select the best-performing algorithm as a predictive tool for decision-making.
After feature selection based on the confirmed predictors, data on 1500 eligible patients (1386 survivors and 144 deaths) were extracted from the registry of Ayatollah Taleghani Hospital, Abadan city, Iran. Several ML algorithms were then trained to predict COVID-19 mortality. Finally, model performance was assessed using metrics derived from the confusion matrix.
The study included 1500 patients; men outnumbered women (836 vs. 664), and the median age was 57.25 years (interquartile 18-100). Feature selection over 38 features identified dyspnea, ICU admission, and oxygen therapy as the top three predictors of COVID-19 mortality, while smoking, alanine aminotransferase, and platelet count were the three lowest-ranked predictors. Random forest (RF) outperformed the other ML algorithms, with accuracy, sensitivity, precision, specificity, and receiver operating characteristic (ROC) of 95.03%, 90.70%, 94.23%, 95.10%, and 99.02%, respectively.
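As an illustration of how the confusion-matrix metrics named above are usually defined, the following minimal Python sketch computes accuracy, sensitivity, precision, and specificity from the four cell counts. The function name `confusion_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the study's data or code, and the standard textbook definitions are assumed.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common metrics from confusion-matrix counts.

    tp: deaths correctly predicted as deaths (true positives)
    fp: survivors wrongly predicted as deaths (false positives)
    tn: survivors correctly predicted as survivors (true negatives)
    fn: deaths wrongly predicted as survivors (false negatives)
    """
    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = confusion_metrics(tp=40, fp=5, tn=50, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In a mortality setting, sensitivity is the share of deaths the model flags and specificity is the share of survivors it correctly clears, which is why the two are reported alongside overall accuracy.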
ML achieved a reasonable level of accuracy in predicting COVID-19 mortality. ML-based predictive models, particularly the RF algorithm, may therefore help identify patients at high risk of mortality and inform appropriate interventions by clinicians.