Tao Haoran, You Lili, Huang Yuhan, Chen Yunxiang, Yan Li, Liu Dan, Xiao Shan, Yuan Bichai, Ren Meng
Department of Endocrinology, Sun Yat-Sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Guangdong Clinical Research Center for Metabolic Diseases, Guangzhou Key Laboratory for Metabolic Diseases, Guangzhou, China.
Front Endocrinol (Lausanne). 2025 Mar 25;16:1526098. doi: 10.3389/fendo.2025.1526098. eCollection 2025.
Diabetic foot ulcers (DFUs) are a major complication of diabetes and a leading cause of nontraumatic lower-extremity amputation (LEA) in this population. We aimed to develop machine learning (ML) models to predict the risk of LEA in patients with DFUs and to interpret the best-performing model using SHapley Additive exPlanations (SHAP).
In this retrospective study, data from 1,035 patients with DFUs at Sun Yat-sen Memorial Hospital were used as the training cohort to develop the ML models. Data from 297 patients across multiple tertiary centers were used for external validation. We used least absolute shrinkage and selection operator (LASSO) analysis to identify predictors of amputation. We developed five ML models [logistic regression (LR), support vector machine (SVM), random forest (RF), k-nearest neighbors (KNN) and extreme gradient boosting (XGBoost)] to predict LEA in DFU patients. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), decision curve analysis (DCA), precision, recall, accuracy, and F1 score. Finally, the SHAP method was used to rank feature importance and to interpret the model.
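The threshold-based metrics and the AUC named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration, and the AUC is computed through its rank interpretation (the probability that a randomly chosen amputation case receives a higher score than a randomly chosen non-amputation case).

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall and F1 at a fixed threshold (assumed 0.5)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = amputation, 0 = no amputation.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

toy_metrics = classification_metrics(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Precision, recall, accuracy and F1 depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of metric can diverge between validation sets.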
In the final cohort of 1,332 individuals, 600 patients underwent amputation. Following hyperparameter optimization, the XGBoost model, built on 17 features, achieved the best amputation prediction performance, with an accuracy of 0.94, a precision of 0.96, an F1 score of 0.94 and an AUC of 0.93 on the internal validation set. On the external validation set, the model attained an accuracy of 0.78, a precision of 0.93, an F1 score of 0.78, and an AUC of 0.83. SHAP analysis identified white blood cell count, lymphocyte count, and blood urea nitrogen level as the model's main predictors.
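The idea behind a SHAP-style attribution can be sketched by computing exact Shapley values for a tiny model. This is an illustration under stated assumptions, not the procedure used in the study: the scoring function `toy_risk`, its coefficients, the feature ordering and the baseline values are all hypothetical, and "absent" features are simply replaced by a baseline value, whereas SHAP libraries use faster, model-specific estimators for tree ensembles such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating every feature coalition.

    Features outside a coalition are set to their baseline value
    (an assumption of this sketch). Cost grows as 2**n, so this is
    only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    """Hypothetical risk score with one interaction term (not a fitted model)."""
    wbc, lymphocytes, bun = features
    return 0.5 * wbc - 0.3 * lymphocytes + 0.2 * bun + 0.1 * wbc * bun


# Hypothetical standardized inputs: [WBC, lymphocyte count, BUN].
patient = [2.0, 1.0, 3.0]
reference = [1.0, 2.0, 1.0]

contributions = shapley_values(toy_risk, patient, reference)
```

The contributions sum to the difference between the prediction for `patient` and the prediction for `reference`, which is the additivity property that lets per-feature values be read as a decomposition of an individual prediction.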
The XGBoost algorithm-based prediction model can be used to dynamically estimate the risk of LEA in DFU patients, making it a valuable tool for preventing the progression of DFUs to amputation.