Cheng S B, Zhao C B, Wu Q, Gou S M, Xiong J X, Yang M, Wang C Y, Wu H S, Yin T
Department of Pancreatic Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China.
Zhonghua Wai Ke Za Zhi. 2024 Oct 1;62(10):929-937. doi: 10.3760/cma.j.cn112139-20240411-00180.
To construct an ensemble machine learning model for predicting clinically relevant postoperative pancreatic fistula (CR-POPF) after pancreaticoduodenectomy and to evaluate its application value. This was a predictive modeling study. Clinical data of 421 patients who underwent pancreaticoduodenectomy in the Department of Pancreatic Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology from June 2020 to May 2023 were retrospectively collected. There were 241 males (57.2%) and 180 females (42.8%), aged (59.7±11.0) years (range: 12 to 85 years). The patients were divided into a training set (315 cases) and a test set (106 cases) by stratified random sampling at a ratio of 3:1. Recursive feature elimination was used to screen features, nine machine learning algorithms were used for modeling, the three models with the best fitting ability were selected, and an ensemble model was constructed by fusing them with the Stacking algorithm. Model performance was evaluated with multiple metrics, and the interpretability of the optimal model was analyzed with the Shapley Additive Explanations (SHAP) method. The patients in the test set were divided into risk groups according to the predicted probability (P) of the alternative pancreatic fistula risk score system (a-FRS); the a-FRS was validated and its predictive performance was compared with that of the model. Among the 421 patients, CR-POPF occurred in 84 cases (20.0%). In the test set, the Stacking ensemble model performed best, with an area under the receiver operating characteristic curve (AUC) of 0.823, an accuracy of 0.83, an F1 score of 0.63, and a Brier score of 0.097. The SHAP summary plot showed that the top 9 factors affecting CR-POPF after pancreaticoduodenectomy were pancreatic duct diameter, CT value ratio, postoperative serum amylase, IL-6, body mass index, operative time, the difference in albumin before and after surgery, procalcitonin, and IL-10.
The effect of each feature on the occurrence of CR-POPF after pancreaticoduodenectomy showed a complex nonlinear relationship. The risk of CR-POPF increased when the pancreatic duct diameter was <3.5 mm, the CT value ratio was <0.95, the postoperative serum amylase concentration was >150 U/L, the IL-6 level was >280 ng/L, the operative time was >350 minutes, and albumin decreased by more than 10 g/L. The AUC of the a-FRS in the test set was 0.668, and the predictive performance of the a-FRS was lower than that of the Stacking ensemble machine learning model. The ensemble machine learning model constructed in this study can predict the occurrence of CR-POPF after pancreaticoduodenectomy and has the potential to become a tool for personalized diagnosis and treatment after pancreaticoduodenectomy.
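The four test-set metrics reported above (AUC, accuracy, F1 score, Brier score) can be illustrated with a minimal standard-library sketch. The labels and probabilities below are invented toy values, not study data. The 0.5 classification threshold is also an assumption, since the abstract does not state the cutoff used for accuracy and F1.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def accuracy_and_f1(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1 after thresholding probabilities (threshold is assumed)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
    tn = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 0)
    fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
    fn = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy example: 1 = CR-POPF occurred, 0 = it did not (invented values).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.7, 0.05]

acc, f1 = accuracy_and_f1(toy_labels, toy_probs)
print(f"AUC={roc_auc(toy_labels, toy_probs):.3f} accuracy={acc:.3f} "
      f"F1={f1:.3f} Brier={brier_score(toy_labels, toy_probs):.3f}")
```

AUC and the Brier score are computed directly from the predicted probabilities, whereas accuracy and F1 depend on the chosen threshold. With a CR-POPF rate of about 20%, accuracy alone can look favorable even when few positive cases are identified.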
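The SHAP method mentioned in the abstract attributes a prediction to individual features using Shapley values. The sketch below computes exact Shapley values for a toy three-feature scoring function by enumerating every coalition. It illustrates the underlying definition only: the abstract does not say which SHAP explainer was used, and practical explainers approximate or specialize this computation rather than enumerating coalitions. The function `toy_risk` and its numeric contributions are hypothetical, chosen only to include one interaction term. The feature names are borrowed from the abstract, but no values are taken from the study.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    value_fn: Callable[[FrozenSet[str]], float], features: Sequence[str]
) -> Dict[str, float]:
    """Exact Shapley values by enumerating all coalitions (feasible for small n)."""
    n = len(features)
    phi: Dict[str, float] = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(known: FrozenSet[str]) -> float:
    """Hypothetical model output when only the features in `known` are used.

    All numbers are arbitrary illustration values, not estimates from the study.
    """
    score = 0.10  # baseline output with no features known
    if "duct_diameter" in known:
        score += 0.20
    if "amylase" in known:
        score += 0.10
    if "bmi" in known:
        score += 0.02
    if "duct_diameter" in known and "amylase" in known:
        score += 0.05  # interaction term, split equally by the Shapley rule
    return score


names = ["duct_diameter", "amylase", "bmi"]
phi = shapley_values(toy_risk, names)
for name in names:
    print(f"{name}: {phi[name]:+.3f}")
```

The attributions sum to the difference between the full-model output and the baseline, which is the additivity property that lets a SHAP summary plot rank features by their contribution.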