Zhao Lulu, Niu Penghui, Wang Wanqing, Han Xue, Luan Xiaoyi, Huang Huang, Zhang Yawei, Zhao Dongbing, Gao Jidong, Chen Yingtai
Department of Pancreatic and Gastric Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Department of Cancer Prevention and Control, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
J Natl Cancer Cent. 2024 Mar 12;4(2):142-152. doi: 10.1016/j.jncc.2024.01.007. eCollection 2024 Jun.
Accurate prognosis prediction is critical for individualized treatment decision-making in patients with gastric cancer. We aimed to develop and test models predicting 6-month and 1-, 2-, 3-, 5-, and 10-year overall survival (OS) and cancer-specific survival (CSS) in gastric cancer patients following gastrectomy.
We used Survival Quilts, a machine learning-based model, to develop and test 6-month and 1-, 2-, 3-, 5-, and 10-year OS and CSS prediction models. Gastrectomy patients in the development set (n = 20,583) and the internal validation set (n = 5,106) were drawn from the Surveillance, Epidemiology, and End Results (SEER) database, while those in the external validation set (n = 6,352) were drawn from the China National Cancer Center Gastric Cancer (NCCGC) database. Furthermore, we selected gastrectomy patients who had not received neoadjuvant therapy as a subgroup for training and testing the prognostic models, in order to preserve the accuracy of tumor-node-metastasis (TNM) staging. The prognostic performance of the OS and CSS models was assessed using the concordance index (C-index) and the area under the curve (AUC).
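As an illustration of the C-index used above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' evaluation code: the abstract does not state which estimator (for example, a time-dependent variant) was used, and the function name `harrell_c_index` and the toy data are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            # Comparable pair: patient i died strictly before patient j's last follow-up.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier death had the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Toy data (hypothetical, five patients).
times = [5, 10, 12, 20, 30]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.3, 0.5, 0.1]
c_index = harrell_c_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the C-index values reported in this abstract should be read.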
The machine learning model showed consistently high accuracy in predicting 6-month and 1-, 2-, 3-, 5-, and 10-year OS in the SEER development set (C-index = 0.861, 0.832, 0.789, 0.766, 0.740, and 0.709; AUC = 0.784, 0.828, 0.840, 0.849, 0.869, and 0.902, respectively), the SEER validation set (C-index = 0.782, 0.739, 0.712, 0.698, 0.681, and 0.660; AUC = 0.751, 0.772, 0.767, 0.762, 0.766, and 0.787, respectively), and the NCCGC set (C-index = 0.691, 0.756, 0.751, 0.737, 0.722, and 0.701; AUC = 0.769, 0.788, 0.790, 0.790, 0.787, and 0.788, respectively). The model also predicted 6-month and 1-, 2-, 3-, 5-, and 10-year CSS in the SEER development set (C-index = 0.879, 0.858, 0.820, 0.802, 0.784, and 0.774; AUC = 0.756, 0.827, 0.852, 0.863, 0.874, and 0.884, respectively) and the SEER validation set (C-index = 0.790, 0.763, 0.741, 0.729, 0.718, and 0.708; AUC = 0.706, 0.758, 0.767, 0.766, 0.766, and 0.764, respectively). In multivariate analysis, membership in the high-risk group, defined by the risk score from the 5-year OS model, was a strong predictor of survival in the SEER development set (hazard ratio [HR] = 14.59, 95% confidence interval [CI]: 13.089-16.293, P < 0.001), the SEER validation set (HR = 2.28, 95% CI: 1.872-2.774, P < 0.001), and the NCCGC set (HR = 1.98, 95% CI: 1.617-2.437, P < 0.001). We further explored the prognostic value of the risk score from the 5-year CSS model in gastrectomy patients and found that the high-risk group remained an independent predictor of CSS in the SEER development set (HR = 12.81, 95% CI: 11.568-14.194, P < 0.001) and the SEER validation set (HR = 1.61, 95% CI: 1.338-1.935, P < 0.001).
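The hazard ratios above can be sanity-checked with simple arithmetic. The sketch below assumes the 95% confidence intervals are Wald intervals computed on the log scale, as is typical for Cox-type models but not stated in the abstract. Under that assumption the point estimate is the geometric mean of the two bounds, so each reported HR should match exactly one of the reported intervals. The helper names `implied_hr`, `log_se`, and `matching_intervals` are hypothetical.

```python
import math


def implied_hr(ci_low: float, ci_high: float) -> float:
    """Point estimate implied by a log-scale Wald interval (geometric mean of bounds)."""
    return math.sqrt(ci_low * ci_high)


def log_se(ci_low: float, ci_high: float, z: float = 1.959964) -> float:
    """Standard error of log(HR) recovered from a 95% interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


def matching_intervals(hr, intervals, rel_tol=0.01):
    """Return the intervals whose implied point estimate is within rel_tol of hr."""
    return [ci for ci in intervals if math.isclose(hr, implied_hr(*ci), rel_tol=rel_tol)]


# Hazard ratios and 95% CIs as they appear in the abstract (OS, then CSS).
hazard_ratios = [14.59, 2.28, 1.98, 12.81, 1.61]
intervals = [
    (13.089, 16.293),
    (1.872, 2.774),
    (1.617, 2.437),
    (11.568, 14.194),
    (1.338, 1.935),
]

for hr in hazard_ratios:
    for low, high in matching_intervals(hr, intervals):
        print(f"HR {hr}: CI {low}-{high}, SE(log HR) = {log_se(low, high):.3f}")
```

Each hazard ratio pairs with a single interval under the 1% tolerance, which is a quick way to confirm that an HR and its interval belong together.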
Survival Quilts could allow accurate prediction of 6-month, 1-, 2-, 3-, 5-, and 10-year OS and CSS in gastric cancer patients following gastrectomy.