Qin Xinxin, Qiu Binxu, Ge Litao, Wu Song, Ma Yuye, Li Wei
Department of Gastric and Colorectal Surgery, General Surgery Center, The First Hospital of Jilin University, Changchun, China.
Nanjing Luhe People's Hospital, General Surgery, Nanjing, China.
Front Oncol. 2024 Dec 5;14:1455914. doi: 10.3389/fonc.2024.1455914. eCollection 2024.
Distant metastasis can substantially change the treatment strategy for patients with gastric cancer, so it is essential to identify patients at high risk of distant metastasis as early as possible.
In this study, we retrospectively collected data on 18,472 patients with gastric cancer from the Surveillance, Epidemiology, and End Results (SEER) database. We applied six machine learning algorithms to construct models that predict distant metastasis of gastric cancer, and we constructed the models using 10-fold cross-validation. We evaluated the models using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, and calibration curves. In addition, we used SHapley Additive exPlanations (SHAP) to interpret the machine learning models, and we plotted correlation heat maps of the predictor variables. We used data from 1,595 patients with gastric cancer treated at the First Hospital of Jilin University for external validation. We then selected the optimal model and built a web-based online calculator for predicting the risk of distant metastasis of gastric cancer.
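The evaluation steps described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' pipeline: the function names, the unshuffled and unstratified fold assignment, and the toy labels and scores are all assumptions made for illustration. It shows how 10-fold splits hold out each sample exactly once, and how AUC and AUPRC (computed here as average precision, one common estimator) are derived from predicted risk scores.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once.

    Assumption: simple interleaved folds, with no shuffling or stratification.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Assumption: scores contain no ties, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank  # precision at each recalled positive
    return total / n_pos


# Hypothetical toy data: 1 = distant metastasis, scores = predicted risk.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

print(f"AUC   = {roc_auc(toy_labels, toy_scores):.3f}")
print(f"AUPRC = {average_precision(toy_labels, toy_scores):.3f}")
```

In a cross-validated workflow, a model would be fitted on each `train_idx` and scored on the matching `test_idx`, and the per-fold AUC values would then be averaged.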
The study included 18,472 patients with gastric cancer from the SEER database, of whom 4,202 (22.75%) had distant metastases. Multivariate logistic regression analysis showed that age, race, grade of differentiation, tumor size, T stage, radiotherapy, and chemotherapy were independent risk factors for distant metastasis of gastric cancer. In the 10-fold cross-validation of the training set, the average AUC of the random forest (RF) model was 0.80. The RF model performed best in both the internal test set and the external validation set. In the internal test set, the RF model had an AUC of 0.80, an AUPRC of 0.555, an accuracy of 0.81, and a precision of 0.78. In the external validation set, it had an AUC of 0.76, an AUPRC of 0.496, an accuracy of 0.82, and a precision of 0.81. Finally, we built a web-based calculator for distant metastasis of gastric cancer using the RF model.
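To put the reported AUPRC and accuracy in context, the short Python sketch below compares them with the baselines implied by the cohort prevalence. The variable names are illustrative, and the comparison assumes that the internal test set has roughly the same proportion of metastatic cases as the full SEER cohort, which the abstract does not state. A classifier with no discriminative ability has an expected AUPRC close to the prevalence, and always predicting "no metastasis" yields an accuracy of one minus the prevalence.

```python
# Counts reported for the SEER cohort.
n_total = 18472
n_metastatic = 4202

# Proportion of patients with distant metastasis.
prevalence = n_metastatic / n_total

# Reported internal test set AUPRC for the random forest model.
auprc_internal = 0.555

# Ratio of the reported AUPRC to the no-skill baseline (assumes the
# test set prevalence matches the full cohort; not stated in the source).
auprc_lift = auprc_internal / prevalence

# Accuracy obtained by always predicting "no distant metastasis".
majority_accuracy = 1 - prevalence

print(f"prevalence         = {prevalence:.4f}")
print(f"AUPRC vs. baseline = {auprc_lift:.2f}x")
print(f"majority accuracy  = {majority_accuracy:.4f}")
```

Under that assumption, the reported AUPRC of 0.555 is more than twice the no-skill baseline, while the reported accuracy of 0.81 sits only modestly above the majority-class accuracy, which is why AUPRC is the more informative metric for an imbalanced outcome like this one.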
Using pathological and clinical indicators, we constructed a well-performing RF model for predicting the risk of distant metastasis in patients with gastric cancer, which can support clinicians in making clinical decisions.