Sun Tao, Yue Xiaofang, Zhang Gong, Lin Qinyan, Chen Xiao, Huang Tiancha, Li Xiang, Liu Weiwei, Tao Zhihua
The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China.
Clin Chim Acta. 2024 Jun 1;559:119705. doi: 10.1016/j.cca.2024.119705. Epub 2024 May 1.
Early recognition and timely intervention for acute kidney injury (AKI) in critically ill patients are crucial to reducing morbidity and mortality. This study aimed to use biomarkers to construct an optimal machine learning model for early prediction of AKI within seven days in critically ill patients.
This prospective cohort study enrolled 929 patients admitted to the ICU: 680 in the training set (Jiefang Campus) and 249 in the external testing set (Binjiang Campus). After strict inclusion and exclusion criteria were applied, 421 patients from the training set were used to construct the predictive model and 167 patients from the external testing set were used to evaluate its predictive performance. Urine and blood samples were collected to measure kidney injury-associated biomarkers. Baseline clinical information and laboratory data were also collected. The average predictive performance of six machine learning models was determined by 10-fold cross-validation.
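The 10-fold cross-validation step can be illustrated with a minimal Python sketch. It is not the study's code: the helper names (`k_fold_indices`, `cross_validate`, `fit_threshold`, `accuracy`), the single-feature threshold classifier, and the synthetic data are all assumptions made for illustration, and any of the six models could be substituted for the toy learner.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(fit, score, X, y, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, and average the k scores."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(
            score(model, [X[i] for i in held_out], [y[i] for i in held_out])
        )
    return mean(scores), scores


# Hypothetical stand-in for a real learner: a threshold on one feature,
# placed midway between the two class means.
def fit_threshold(X, y):
    pos = [x[0] for x, label in zip(X, y) if label == 1]
    neg = [x[0] for x, label in zip(X, y) if label == 0]
    return (mean(pos) + mean(neg)) / 2


def accuracy(threshold, X, y):
    hits = sum((x[0] > threshold) == bool(label) for x, label in zip(X, y))
    return hits / len(y)


# Synthetic data, not patient data.
X = [[float(i)] for i in range(100)]
y = [1 if i >= 50 else 0 for i in range(100)]
mean_acc, fold_scores = cross_validate(fit_threshold, accuracy, X, y)
print(f"mean accuracy over {len(fold_scores)} folds: {mean_acc:.3f}")
```

With 421 patients, this split would give nine folds of 42 and one of 43. Averaging the per-fold scores is what allows several candidate models to be compared on the same footing.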
In total, 78 variables were collected at ICU admission, of which 43 differed significantly between the AKI and non-AKI cohorts. Univariate logistic regression then selected 35 variables as independent features for AKI. Spearman correlation analysis was used to remove two highly correlated variables. Three ranking methods were used to assess the influence of the remaining 33 variables and to determine the best combination of variables. The Gini importance ranking method proved suitable for variable filtering. Among the six machine learning models, AKIML, constructed with the XGBoost algorithm, had the best predictive performance. AKIML performed best when it included the nine highest-ranked features (NGAL, IGFBP7, sCysC, CAF22, KIM-1, NT-proBNP, IL-6, IL-18 and L-FABP), with an AUC of 0.881 and an accuracy of 0.815 in the training set and an AUC of 0.889 and an accuracy of 0.846 in the validation set. Performance was slightly better in the testing set, with an AUC of 0.902 and an accuracy of 0.846. The SHAP algorithm was used to interpret the predictions of AKIML. A web calculator for AKIML is available for more convenient AKI prediction (https://www.xsmartanalysis.com/model/list/predict/model/html?mid=8065&symbol=11gk693982SU6AE1ms21). AKIML outperformed the optimal model built with routine tests alone for predicting AKI in critically ill patients within 7 days.
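The Spearman filtering step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not report the correlation cutoff, so the `threshold=0.8` default is hypothetical, as are the function names, the feature names, and the toy values.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(a), average_ranks(b))


def drop_correlated(columns, threshold=0.8):
    """Keep features in order, skipping any with |rho| above the (assumed) cutoff."""
    kept = []
    for name in columns:
        if all(abs(spearman(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy columns with hypothetical names; feature_b is a monotone transform of feature_a.
columns = {
    "feature_a": [1, 2, 3, 4, 5, 6],
    "feature_b": [1, 8, 27, 64, 125, 216],
    "feature_c": [3, 1, 6, 2, 5, 4],
}
print(drop_correlated(columns))
```

Because Spearman's rho depends only on ranks, `feature_b` is perfectly correlated with `feature_a` despite the nonlinear relationship, so it is the one dropped. Which member of a correlated pair to keep is a design choice; this sketch simply keeps the first one encountered.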
AKIML, constructed with the XGBoost algorithm using the nine most influential biomarkers from the Gini importance ranking, had the best performance in predicting AKI in critically ill patients within 7 days. This data-driven predictive model will help clinicians make quick and accurate diagnoses.