Zhu Hao, Zhou Yiyan, Shen Danyang, Wu Kejia, Gan Xiaojie, Xue Xiaofeng, Zhang Weigang, Yang Xiaohua, Qiu Junyi, Sun Ding
Department of General Surgery, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China.
Department of General Surgery, Affiliated Hospital of Nantong University, Nantong, Jiangsu Province, China.
BMC Cancer. 2025 Jul 1;25(1):1117. doi: 10.1186/s12885-025-14503-3.
The liver is the most frequent site of distant metastasis in pancreatic ductal adenocarcinoma (PDAC), and liver metastasis contributes significantly to poor prognosis. This study aimed to develop and validate a machine learning (ML) model for predicting early liver metastasis (ELM) following pancreatic cancer surgery.
This retrospective study included 407 patients with pancreatic cancer who underwent surgery at the First Affiliated Hospital of Soochow University between January 2015 and December 2023. Seven ML algorithms were used to predict the risk of liver metastasis within one year after surgery. The training cohort (n = 284) was used for model development and hyperparameter tuning, and the internal validation cohort (n = 123) was used to assess predictive performance. Shapley additive explanations (SHAP) were applied to interpret the predictions of the best-performing model. To assess the generalizability of the model, 131 PDAC patients from the Affiliated Hospital of Nantong University were included as an external validation cohort.
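To make the idea behind SHAP concrete, the following is a minimal sketch of an exact Shapley value calculation on a toy model. It is not the study's XGBoost model or the SHAP library: the function `toy_model`, its three unnamed features, and the all-zero baseline are hypothetical, and a single baseline point is used as a simplification in place of the background-data expectation that SHAP estimates.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to one `baseline` point."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    # Hypothetical risk score with an interaction term (assumption, not the study's model).
    a, b, c = features
    return 0.1 + 0.3 * a + 0.2 * b + 0.4 * a * c


x = [1.0, 1.0, 1.0]          # hypothetical patient
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The per-feature contributions add up to the difference between the prediction for the patient and the prediction at the baseline, which is the additive property that gives SHAP its name. The interaction term between the first and third features is split equally between them.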
A total of 194 patients (36.1%) across the two centers were diagnosed with ELM during the 1-year postoperative follow-up. Of 22 disease characteristics, nine key features were selected for model development. XGBoost performed best, achieving an AUC of 0.901, accuracy of 0.846, sensitivity of 0.756, specificity of 0.897, and an F1 score of 0.782. The Brier score of 0.12 indicated excellent calibration. Both the internal and external validation cohorts showed consistent and robust performance on ROC curves, calibration plots, decision curves, and clinical impact curves, supporting the model's clinical utility.
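The reported threshold metrics are all functions of a single confusion matrix, and the Brier score is the mean squared difference between predicted probability and outcome. The sketch below shows these definitions. The abstract does not report confusion-matrix counts or say which cohort the metrics refer to; the counts used here (TP = 34, FN = 11, FP = 8, TN = 70, total 123) are an illustrative assumption, chosen only because they round to the reported values if the metrics describe the internal validation cohort. The probabilities passed to `brier_score` are likewise made up.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Illustrative counts only (assumption): not reported in the abstract.
metrics = classification_metrics(tp=34, fn=11, fp=8, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Made-up outcomes and predicted probabilities to show the Brier calculation.
example_brier = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
print(f"brier: {example_brier:.3f}")
```

A Brier score of 0 means perfect probabilistic predictions, and lower is better. Because the score reflects both calibration and discrimination, it is usually read alongside a calibration plot.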
An XGBoost model was developed to predict the likelihood of ELM after PDAC surgery with high accuracy. Additionally, the model was implemented as an application, providing clinicians with an accessible visual tool to support personalized clinical strategies and ultimately enhance patient outcomes.