Li Zihan, Zhang Yibo, Chen Zixiang, Chen Jiangming, Hou Hui, Wang Cheng, Lu Zheng, Wang Xiaoming, Geng Xiaoping, Liu Fubao
Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei, China.
Cardiology Division, Department of Medicine, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong SAR, China.
Front Digit Health. 2024 Nov 27;6:1510674. doi: 10.3389/fdgth.2024.1510674. eCollection 2024.
Methods for accurately predicting the prognosis of patients with recurrent hepatolithiasis (RH) after biliary surgery are lacking. This study aimed to develop a model that dynamically predicts the risk of hepatolithiasis recurrence using a machine-learning (ML) approach based on high-order correlations among multiple clinical variables.
Data from patients with RH who underwent surgery at five centres between January 2015 and December 2020 were collected and divided into training and testing sets. Nine predictive models, collectively named the Correlation Analysis and Recurrence Evaluation System (CARES), were developed using ML methods and compared on their ability to predict patients' dynamic recurrence risk within 5 postoperative years. We used 10-fold cross-validation and evaluated model performance on a separate testing set. Model performance was assessed using the area under the receiver operating characteristic curve, and the importance and direction of effect of each predictive variable were interpreted using Shapley Additive Explanations (SHAP).
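The evaluation procedure described above (10-fold cross-validation scored by the area under the ROC curve) can be sketched in a few lines of standard-library Python. This is a minimal illustration only, not the study's code: the function names `auc`, `stratified_folds`, `cross_validated_auc` and `fit_centroid` are hypothetical, the stratified split is an assumption (the abstract does not say how folds were formed), and the toy centroid scorer merely stands in for the nine CARES models.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(y, k=10, seed=0):
    """Assumed split: shuffle each class separately, then deal indices
    round-robin so every fold gets a similar share of both classes."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds

def cross_validated_auc(X, y, fit, k=10, seed=0):
    """Mean held-out AUC over k folds; fit(X_train, y_train) must return
    a function mapping one feature vector to a risk score."""
    folds = stratified_folds(y, k, seed)
    fold_aucs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        fold_aucs.append(auc([y[i] for i in held_out],
                             [score(X[i]) for i in held_out]))
    return sum(fold_aucs) / len(fold_aucs)

def fit_centroid(X_train, y_train):
    """Toy stand-in model: project onto the difference of class means."""
    n_features = len(X_train[0])
    def mean(cls, j):
        vals = [x[j] for x, label in zip(X_train, y_train) if label == cls]
        return sum(vals) / len(vals)
    w = [mean(1, j) - mean(0, j) for j in range(n_features)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

# Synthetic, perfectly separable data (not patient data).
y_demo = [i % 2 for i in range(40)]
X_demo = [[float(label)] for label in y_demo]
print(cross_validated_auc(X_demo, y_demo, fit_centroid, k=10))
```

In the setting the abstract describes, the final comparison would use the same `auc` function on predictions for the separate testing set, which is never touched during cross-validation.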
Models based on ML methods outperformed those based on traditional regression analysis in predicting recurrence risk in patients with RH. Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) performed best, both yielding an area under the receiver operating characteristic curve (AUC) of approximately 0.9 or higher. These models performed even better on the testing set than in 10-fold cross-validation, suggesting that they were not overfitted. The SHAP method revealed that immediate stone clearance, final stone clearance, number of previous surgeries, and preoperative CA19-9 index were the most important predictors of recurrence after reoperation in patients with RH. An online version of the CARES model was implemented.
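To make the idea behind the SHAP ranking concrete, the sketch below computes exact Shapley values for a toy risk function by enumerating feature coalitions. It is an illustration under stated assumptions, not the study's analysis: `shapley_values` and `toy_risk` are hypothetical names, the coefficients are invented purely for the example, and features outside a coalition are set to a single baseline patient rather than averaged over a background dataset as SHAP implementations typically do.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline)
    to each feature, by enumerating all coalitions of the others."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the rest
        # stay at the baseline values.
        return predict([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Feature order: immediate stone clearance (0/1), final stone clearance (0/1),
# number of previous surgeries, preoperative CA19-9.
# The coefficients below are made up for illustration; they are NOT from the study.
def toy_risk(features):
    immediate, final, n_surgeries, ca199 = features
    return 0.5 - 0.3 * immediate - 0.2 * final + 0.1 * n_surgeries + 0.002 * ca199

patient = [0, 1, 3, 100]   # hypothetical patient
baseline = [1, 1, 1, 20]   # hypothetical reference patient
contributions = shapley_values(toy_risk, patient, baseline)
print(contributions)
```

The contributions sum to the difference between the patient's predicted risk and the baseline's, which is the property that lets per-feature values be read as each variable's share of a prediction.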
The CARES model was first developed using ML methods and then packaged as an online tool for predicting recurrence in patients with RH after hepatectomy; it can guide clinical decision-making and personalised postoperative surveillance.