Ghosheh Ghadeer O, Alamad Bana, Yang Kai-Wen, Syed Faisil, Hayat Nasir, Iqbal Imran, Al Kindi Fatima, Al Junaibi Sara, Al Safi Maha, Ali Raghib, Zaher Walid, Al Harbi Mariam, Shamout Farah E
Engineering Division, NYU Abu Dhabi, United Arab Emirates.
Abu Dhabi Health Services, United Arab Emirates.
Intell Based Med. 2022;6:100065. doi: 10.1016/j.ibmed.2022.100065. Epub 2022 Jun 13.
Clinical evidence suggests that some patients diagnosed with coronavirus disease 2019 (COVID-19) experience a variety of complications associated with significant morbidity, especially in severe cases during the initial spread of the pandemic. To support early interventions, we propose a machine learning system that predicts the risk of developing multiple complications. We processed data collected from 3,352 patient encounters admitted to 18 facilities between April 1 and April 30, 2020, in Abu Dhabi (AD), United Arab Emirates. Using data collected during the first 24 h of admission, we trained machine learning models to predict the risk of developing any of three complications after 24 h of admission: Secondary Bacterial Infection (SBI), Acute Kidney Injury (AKI), and Acute Respiratory Distress Syndrome (ARDS). To assess the generalizability of the proposed system, the hospitals were grouped by geographical proximity into set A (AD Middle region) and set B (AD Western & Eastern regions). The overall system includes a data filtering criterion, hyperparameter tuning, and model selection. In test set A, consisting of 587 patient encounters (mean age: 45.5 years), the system achieved a good area under the receiver operating characteristic curve (AUROC) for the prediction of SBI (0.902), AKI (0.906), and ARDS (0.854). Similarly, in test set B, consisting of 225 patient encounters (mean age: 42.7 years), the system performed well for the prediction of SBI (0.859 AUROC), AKI (0.891 AUROC), and ARDS (0.827 AUROC). The performance results and feature importance analysis highlight the system's generalizability and interpretability. The findings illustrate how machine learning models can achieve strong performance even when using a limited set of routine input variables. Because the proposed system is data-driven, we believe it can be easily repurposed for different outcomes as COVID-19 variants change over time.
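The per-complication AUROC evaluation described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`auroc`, `evaluate_per_complication`) and the toy labels and scores are hypothetical, and AUROC is computed here through its pairwise-ranking interpretation (the probability that a randomly chosen positive encounter receives a higher score than a randomly chosen negative one).

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate_per_complication(
    y_true: Dict[str, Sequence[int]],
    y_score: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Compute one AUROC per complication (each treated as a separate binary task)."""
    return {name: auroc(y_true[name], y_score[name]) for name in y_true}


# Toy, made-up values for four encounters; NOT data from the study.
toy_true = {
    "SBI": [0, 0, 1, 1],
    "AKI": [0, 1, 0, 1],
    "ARDS": [0, 0, 0, 1],
}
toy_score = {
    "SBI": [0.10, 0.40, 0.35, 0.80],
    "AKI": [0.20, 0.90, 0.30, 0.70],
    "ARDS": [0.10, 0.20, 0.30, 0.90],
}

results = evaluate_per_complication(toy_true, toy_score)
for name, value in results.items():
    print(f"{name}: AUROC = {value:.3f}")
```

Under this assumed setup, the same function would be called once on the predictions for test set A and once for test set B, so that performance in each regional group is reported separately.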