Pattharanitima Pattharawin, Thongprayoon Charat, Kaewput Wisit, Qureshi Fawad, Qureshi Fahad, Petnak Tananchai, Srivali Narat, Gembillo Guido, O'Corragain Oisin A, Chesdachai Supavit, Vallabhajosyula Saraschandra, Guru Pramod K, Mao Michael A, Garovic Vesna D, Dillon John J, Cheungpasitporn Wisit
Department of Internal Medicine, Faculty of Medicine, Thammasat University, Pathum Thani 12121, Thailand.
Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, MN 55905, USA.
J Clin Med. 2021 Oct 28;10(21):5021. doi: 10.3390/jcm10215021.
Lactic acidosis is the most common cause of anion gap metabolic acidosis in the intensive care unit (ICU) and is associated with poor outcomes, including mortality. We sought to compare machine learning (ML) approaches with logistic regression analysis for predicting mortality in patients with lactic acidosis admitted to the ICU.
We used the Medical Information Mart for Intensive Care (MIMIC-III) database to identify adult ICU patients with lactic acidosis (serum lactate ≥4 mmol/L). The outcome of interest was hospital mortality. Using the training dataset, we developed prediction models with four ML approaches, namely random forest (RF), decision tree (DT), extreme gradient boosting (XGBoost), and artificial neural network (ANN), and with statistical modeling by forward stepwise logistic regression. We then assessed model performance in the independent testing dataset using area under the receiver operating characteristic curve (AUROC), accuracy, precision, error rate, Matthews correlation coefficient (MCC), and F1 score, and assessed model calibration using the Brier score.
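The evaluation metrics listed above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' code: the function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for the example, and the abstract does not state which threshold or software was used.

```python
import math

def auroc(y_true, y_prob):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def evaluate(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics plus the Brier score (threshold is an assumption)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(y_true)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Brier score: mean squared difference between predicted risk and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n

    return {
        "auroc": auroc(y_true, y_prob),
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "brier": brier,
    }

# Hypothetical toy data: 1 = died in hospital, 0 = survived.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]
metrics = evaluate(toy_labels, toy_probs)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUROC and the Brier score use the predicted probabilities directly, whereas accuracy, precision, error rate, MCC, and F1 score depend on the chosen threshold. A lower Brier score indicates better-calibrated probabilities.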
Of 1919 ICU patients with lactic acidosis, 1535 and 384 were included in the training and testing datasets, respectively. Hospital mortality was 30%. RF had the highest AUROC at 0.83, followed by logistic regression at 0.81, XGBoost at 0.81, ANN at 0.79, and DT at 0.71. RF also had the highest accuracy (0.79), MCC (0.45), and F1 score (0.56), and the lowest error rate (21.4%). The RF model was the best calibrated: the Brier scores for RF, DT, XGBoost, ANN, and multivariable logistic regression were 0.15, 0.19, 0.18, 0.19, and 0.16, respectively. The RF model outperformed the multivariable logistic regression model, the SOFA score (AUROC 0.74), the SAPS II score (AUROC 0.77), and the Charlson score (AUROC 0.69).
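As a rough aid to reading these figures, the short sketch below checks the arithmetic behind the reported split and error rate, and computes a reference Brier score for an uninformative model. The helper name `constant_brier` is hypothetical, and applying the overall 30% mortality to the testing dataset is an assumption, since the abstract does not report mortality per dataset.

```python
# Figures reported in the abstract.
total, n_train, n_test = 1919, 1535, 384
reported_error_rate = 0.214

# Fraction of patients allocated to training (roughly an 80/20 split).
train_fraction = n_train / total

# Accuracy and error rate are complements, so 21.4% error implies ~0.79 accuracy.
accuracy_from_error = 1 - reported_error_rate

def constant_brier(prevalence, prediction):
    """Expected Brier score when every patient receives the same predicted risk."""
    return (prevalence * (1 - prediction) ** 2
            + (1 - prevalence) * prediction ** 2)

# Assumption: testing-set mortality equals the overall 30% figure.
baseline_brier = constant_brier(0.30, 0.30)

print(f"training fraction: {train_fraction:.3f}")
print(f"accuracy implied by error rate: {accuracy_from_error:.3f}")
print(f"Brier score of a constant 30% prediction: {baseline_brier:.2f}")
```

Under that assumption, always predicting a 30% risk would yield a Brier score of 0.21, so the reported values of 0.15 to 0.19 all improve on the uninformative reference, with RF improving the most.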
The ML prediction model using the RF algorithm provided the highest predictive performance for hospital mortality among ICU patients with lactic acidosis.