Tezza Fabiana, Lorenzoni Giulia, Azzolina Danila, Barbar Sofia, Leone Lucia Anna Carmela, Gregori Dario
Geriatric Unit, Ospedali Riuniti di Padova Sud, AULSS 6 Euganea, 35043 Monselice, Italy.
Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy.
J Pers Med. 2021 Apr 24;11(5):343. doi: 10.3390/jpm11050343.
The present work aims to identify the predictors of COVID-19 in-hospital mortality by testing a set of Machine Learning Techniques (MLTs) and comparing their ability to predict the outcome of interest. The best-performing model will be used to identify in-hospital mortality predictors and to build an in-hospital mortality prediction tool. The study involved patients with COVID-19, confirmed by PCR test, admitted to the "Ospedali Riuniti Padova Sud" COVID-19 referral center in the Veneto region, Italy. The algorithms considered were the Recursive Partition Tree (RPART), the Support Vector Machine (SVM), the Gradient Boosting Machine (GBM), and Random Forest. The resampled performance of each MLT was reported in terms of sensitivity, specificity, and Receiver Operating Characteristic (ROC) curve measures. The study enrolled 341 patients. The median age was 74 years, and most patients were male. The Random Forest algorithm outperformed the other MLTs in predicting in-hospital mortality, with a ROC of 0.84 (95% CI 0.78-0.90). Age, vital signs (oxygen saturation and the quick SOFA score), and laboratory parameters (creatinine, AST, lymphocytes, platelets, and hemoglobin) were found to be the strongest predictors of in-hospital mortality. The present work provides insights into predicting in-hospital mortality of COVID-19 patients with a machine-learning algorithm.
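The abstract reports sensitivity, specificity, and a ROC measure for each algorithm. As a rough illustration of what those three quantities are, the following is a minimal Python sketch, not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for this example; they are not study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = died in hospital.

    Assumes both classes are present, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # deaths correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deaths missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # survivors correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # survivors wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case gets a higher score than a randomly chosen negative
    case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 patients, predicted mortality risk from some model.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is presumably why a ROC measure is the headline figure when comparing the four algorithms.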