Li Congyang, Wang Chenggang, Zhang Jiru, Zheng Wenjun, Shi Jing, Li Li, Shi Xuezhi
Department of Orthopaedics, Lu'an Hospital of Anhui Medical University, Lu'an, China.
Wound Stoma Care Clinic, Lu'an Hospital of Anhui Medical University, Anhui, China.
Front Med (Lausanne). 2025 Jun 26;12:1553274. doi: 10.3389/fmed.2025.1553274. eCollection 2025.
Currently, there is no individualized prediction model for joint function recovery after ankle fracture surgery. This study aims to develop a prediction model for poor recovery following ankle fracture surgery using various machine learning algorithms to facilitate early identification of high-risk patients.
A total of 750 patients who underwent ankle fracture surgery at Lu'an Hospital Affiliated to Anhui Medical University between January 2018 and December 2023 were followed up. The collected data were divided chronologically into a training set (599 cases) and a test set (151 cases). Feature variables were selected using the Boruta algorithm, and five machine learning algorithms (logistic regression, random forest, extreme gradient boosting [XGBoost], support vector machine [SVM], and lasso-stacking) were used to construct models. Model performance was compared on both the training and test sets to select the best-performing model. The decision basis of the optimal model was then analyzed using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME).
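A chronological split of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: the `Patient` record, its field names, the cutoff date, and the toy cohort are all assumptions, since the abstract reports only the resulting set sizes (599 and 151) and not the date at which the data were divided.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Patient:
    # Hypothetical record layout; the abstract does not list the fields.
    patient_id: int
    surgery_date: date
    poor_recovery: bool


def chronological_split(patients, cutoff):
    """Put surgeries before `cutoff` in training, the rest in testing."""
    ordered = sorted(patients, key=lambda p: p.surgery_date)
    train = [p for p in ordered if p.surgery_date < cutoff]
    test = [p for p in ordered if p.surgery_date >= cutoff]
    return train, test


# Toy cohort: one surgery per year from 2018 to 2023 (illustrative only).
cohort = [Patient(i, date(2018 + i, 6, 1), i % 2 == 0) for i in range(6)]

# The cutoff below is an assumed value chosen for the toy data.
train, test = chronological_split(cohort, cutoff=date(2023, 1, 1))
```

Splitting by date rather than at random means every test patient was operated on after every training patient, so the test set approximates how a model would be used on future cases.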
The Boruta algorithm identified 12 feature variables. The five models performed as follows, with values given as training set and test set: random forest, AUC 0.840 and 0.779, accuracy 0.781 and 0.742; SVM, AUC 0.809 and 0.768, accuracy 0.751 and 0.728; XGBoost, AUC 0.734 and 0.748, accuracy 0.668 and 0.722; logistic regression, AUC 0.672 and 0.691, accuracy 0.651 and 0.656; lasso-stacking, AUC 0.877 and 0.791, accuracy 0.796 and 0.762. The precision-recall (PR) curve and decision curve of the lasso-stacking model were also better than those of the other models, making it the best-performing model overall. SHAP analysis showed that functional exercise compliance, combined ligament injury, and open fracture accounted for the largest share of SHAP values and were the most important influencing factors.
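To make the comparison easier to scan, the reported figures can be tabulated and ranked. The sketch below uses only the AUC and accuracy values stated in the abstract; the `reported` dictionary and the `train_test_gap` helper are illustrative names, and the training-minus-test gap is a simple descriptive quantity added here, not a metric the authors report.

```python
# (training set, test set) values as reported in the abstract.
reported = {
    "random forest": {"auc": (0.840, 0.779), "accuracy": (0.781, 0.742)},
    "SVM": {"auc": (0.809, 0.768), "accuracy": (0.751, 0.728)},
    "XGBoost": {"auc": (0.734, 0.748), "accuracy": (0.668, 0.722)},
    "logistic regression": {"auc": (0.672, 0.691), "accuracy": (0.651, 0.656)},
    "lasso-stacking": {"auc": (0.877, 0.791), "accuracy": (0.796, 0.762)},
}


def train_test_gap(metric):
    """Training value minus test value for each model, rounded to 3 places."""
    return {
        name: round(values[metric][0] - values[metric][1], 3)
        for name, values in reported.items()
    }


auc_gap = train_test_gap("auc")

# Models ordered from highest to lowest test-set AUC.
ranking = sorted(reported, key=lambda name: reported[name]["auc"][1], reverse=True)

for name in ranking:
    train_auc, test_auc = reported[name]["auc"]
    print(f"{name:20s} train={train_auc:.3f} test={test_auc:.3f} gap={auc_gap[name]:+.3f}")
```

On these figures the lasso-stacking model ranks first on test-set AUC and also shows the largest training-to-test drop (0.086), while XGBoost and logistic regression score slightly higher on the test set than on the training set.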
Among the models developed and compared, the lasso-stacking model performed best and is the most suitable for predicting joint function recovery after ankle fracture surgery. The model can be further validated externally and then applied in clinical practice.