Department of Minimally Invasive Surgery, The Affiliated Lihuili Hospital, Ningbo University, Ningbo, Zhejiang, China.
Health Science Center, Ningbo University, Ningbo, Zhejiang, China.
J Cancer Res Clin Oncol. 2023 Oct;149(13):11857-11871. doi: 10.1007/s00432-023-05071-9. Epub 2023 Jul 6.
Surgery is a primary therapeutic approach for borderline resectable and locally advanced pancreatic cancer (BR/LAPC). However, BR/LAPC lesions are highly heterogeneous, and not all BR/LAPC patients who undergo surgery benefit from it. The present study aims to use machine learning (ML) algorithms to identify patients who would benefit from primary tumor surgery.
We retrieved clinical data of patients with BR/LAPC from the Surveillance, Epidemiology, and End Results (SEER) database and classified them into surgery and non-surgery groups based on primary tumor surgery status. To reduce confounding, propensity score matching (PSM) was employed. We hypothesized that patients who underwent surgery and had a longer median cancer-specific survival (CSS) than those who did not undergo surgery benefited from surgical intervention. Clinical and pathological features were used to construct six ML models, and model performance was compared using the area under the curve (AUC), calibration plots, and decision curve analysis (DCA). We selected the best-performing algorithm (i.e., XGBoost) to predict postoperative benefit. The SHapley Additive exPlanations (SHAP) approach was used to interpret the XGBoost model. Additionally, prospectively collected data from 53 Chinese patients were used for external validation of the model.
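The benefit definition above can be illustrated with a minimal sketch. It assumes one reading of the hypothesis: a surgery patient is labeled as benefiting when their CSS exceeds the median CSS of the non-surgery group. The field names `surgery` and `css_months`, the function name, and the toy cohort are hypothetical and do not come from the study, and censoring is ignored for simplicity.

```python
from statistics import median


def label_surgical_benefit(patients):
    """Label each surgery patient as benefit (1) or no benefit (0).

    Assumption: "benefit" means CSS longer than the median CSS of the
    non-surgery group. Each patient is a dict with the hypothetical keys
    'surgery' (bool) and 'css_months' (number). Censoring is ignored.
    """
    non_surgery_css = [p["css_months"] for p in patients if not p["surgery"]]
    if not non_surgery_css:
        raise ValueError("at least one non-surgery patient is required")

    # Reference threshold: median CSS among patients without surgery.
    threshold = median(non_surgery_css)

    labels = []
    for p in patients:
        if p["surgery"]:
            labels.append(1 if p["css_months"] > threshold else 0)
        else:
            # Non-surgery patients only define the threshold.
            labels.append(None)
    return threshold, labels


# Toy cohort (invented values, for illustration only).
cohort = [
    {"surgery": False, "css_months": 6},
    {"surgery": False, "css_months": 10},
    {"surgery": False, "css_months": 14},
    {"surgery": True, "css_months": 8},
    {"surgery": True, "css_months": 22},
]
threshold, labels = label_surgical_benefit(cohort)
```

In this sketch the resulting binary labels would serve as the prediction target for the ML models; in practice the threshold would be computed on the matched cohort after PSM.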
In tenfold cross-validation on the training cohort, the XGBoost model yielded the best performance (AUC = 0.823, 95% CI 0.707-0.938). Internal (74.3% accuracy) and external (84.3% accuracy) validation supported the generalizability of the model. The SHAP analysis provided explanations independent of the model and highlighted important factors related to postoperative survival benefit in BR/LAPC, the top three being age, chemotherapy, and radiation therapy.
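For readers less familiar with the two metrics reported above, the following is a minimal sketch of how AUC and accuracy can be computed from predicted probabilities. The function names, the 0.5 decision threshold, and the toy data are assumptions for illustration; they are not the study's data or code.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(1 for y, p in zip(y_true, preds) if y == p)
    return correct / len(y_true)


# Toy labels and predicted probabilities (invented values).
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]

auc = roc_auc(y_true, y_score)
acc = accuracy(y_true, y_score)
```

AUC does not depend on a decision threshold, whereas accuracy does, which is one reason the two are reported separately.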
By integrating ML algorithms and clinical data, we have established a highly efficient model to facilitate clinical decision-making and assist clinicians in selecting the population that would benefit from surgery.