Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Av. Eugenio Garza Sada 2501 Sur, Tecnológico, 64849, Monterrey, N.L., Mexico.
General Motors, Pontiac, MI, USA.
BMC Med Inform Decis Mak. 2022 Mar 26;22(1):78. doi: 10.1186/s12911-022-01820-x.
Coronavirus disease 2019 (COVID-19) is a novel pandemic, and at present we do not have enough knowledge about the behaviour of the virus or about the key performance indicators (KPIs) needed to forecast mortality risk. Moreover, relying on a large number of complex and expensive biomarkers may be infeasible for many low-budget hospitals. Timely identification of the risk of mortality of COVID-19 patients (RMCPs) is essential to improve hospitals' management systems and resource allocation standards.
This research work proposes a COVID-19 mortality risk calculator based on a deep learning (DL) model and trained on a dataset provided by HM Hospitals, Madrid, Spain. A pre-processing strategy for unbalanced classes and feature selection is proposed. To evaluate the proposed methods, the Synthetic Minority Over-sampling Technique (SMOTE) and a data imputation approach based on K-nearest neighbours are introduced.
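The abstract does not describe how the two pre-processing steps are implemented, so the following is only a minimal sketch of the general ideas, not the authors' code. It assumes numeric features, uses `None` for a missing biomarker value, and the function names `knn_impute` and `smote` as well as the toy values are hypothetical. In practice a maintained library implementation would likely be used instead.

```python
import math
import random


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is computed only over features observed in both rows.
    """
    def dist(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows in which feature j is observed.
            donors = sorted(
                (dist(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:  # leave the gap if no row can donate a value
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two features, one missing value in the second row.
records = [[1.0, 2.0], [1.1, None], [5.0, 9.0], [0.9, 2.2]]
completed = knn_impute(records, k=2)

# Hypothetical minority-class samples to be over-sampled.
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
extra_samples = smote(minority_class, n_new=5, k=2)
```

Imputation is applied before over-sampling in this sketch because the interpolation step needs complete feature vectors. Whether the study used the same order is not stated in the abstract.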
A total of 1,503 seriously ill COVID-19 patients with a median age of 70 years are included in the research work, of whom 927 (61.7%) are male and 576 (38.3%) are female. A total of 48 features are considered to evaluate the proposed method, which achieves the following results: area under the curve (AUC) 0.93, F2 score 0.93, recall 1.00, accuracy 0.95, precision 0.91, specificity 0.9279 and maximum probability of correct decision (MPCD) 0.93.
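For readers less familiar with the threshold-based metrics listed above, the sketch below shows their standard definitions in terms of confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study. AUC and MPCD are not covered, since they depend on the predicted scores rather than on a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn, beta=2.0):
    """Standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero. beta=2 gives the F2 score,
    which weights recall more heavily than precision.
    """
    recall = tp / (tp + fn)            # sensitivity: deaths correctly flagged
    precision = tp / (tp + fp)         # flagged patients who actually died
    specificity = tn / (tn + fp)       # survivors correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_beta": f_beta,
    }


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=8, fp=2, tn=9, fn=1)
```

The F2 score is a natural choice in this setting because a missed high-risk patient (false negative) is usually considered costlier than a false alarm.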
The results show that the proposed method performs strongly for mortality risk prediction in patients with COVID-19 infection. The MPCD score shows that the proposed DL model achieves the best performance on every dataset, even when an over-sampling technique is applied. The benefits of the data imputation algorithm for unavailable biomarker data are also evaluated. Based on the results, the proposed scheme could be an appropriate tool for assessing the risk of mortality and the prognosis of critically ill COVID-19 patients.