The Marine Biomedical Research Institute, Guangdong Medical University, Zhanjiang, 524023, Guangdong, China.
Southern Marine Science and Engineering Guangdong Laboratory (Zhanjiang), Zhanjiang, 524023, Guangdong, China.
Clin Exp Med. 2023 Sep;23(5):1609-1620. doi: 10.1007/s10238-022-00858-5. Epub 2022 Jul 11.
Previous studies have revealed an increased risk of secondary primary cancers (SPC) after lung cancer. Prognostic prediction models for patients with SPC after lung cancer are particularly needed to guide screening. Therefore, this study retrospectively analyzed the Surveillance, Epidemiology, and End Results (SEER) database using classical statistics and machine learning to explore risk factors and construct a novel overall survival (OS) prediction nomogram for patients with SPC after lung cancer. Data on patients with SPC after lung cancer, covering 2000 to 2016, were gathered from the SEER database. The incidence of SPC after lung cancer was assessed using standardized incidence ratios (SIRs). Cox proportional hazards regression, machine learning (ML), Kaplan-Meier (KM) methods, and log-rank tests were used to identify the important prognostic factors for predicting OS. These significant prognostic factors were used to develop an OS prediction nomogram. In total, 10,487 SPC samples from the SEER database were randomly divided into training and validation cohorts (for model construction and internal validation, respectively). In the random forest (RF) and extreme gradient boosting (XGBoost) feature importance rankings, age was the most important variable, which was also reflected in the nomogram. The models that combined machine learning with Cox proportional hazards regression had better predictive performance than the model that used Cox proportional hazards regression alone (AUC = 0.762 for RF, AUC = 0.737 for XGBoost, AUC = 0.722 for Cox). Calibration curves and decision curve analysis (DCA) also indicated that the nomogram has excellent clinical utility. The web-based dynamic nomogram calculator was made accessible at https://httseer.shinyapps.io/DynNomapp/. The prognostic characteristics of SPC following lung cancer were systematically reviewed. The dynamic nomogram we constructed can provide survival predictions to assist clinicians in making individualized decisions.
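As a rough illustration of the standardized incidence ratio mentioned in the abstract, the sketch below computes an SIR as observed cases divided by expected cases and attaches an approximate 95% confidence interval using Byar's approximation to the Poisson distribution. The abstract does not state how the study's intervals were obtained, so the choice of Byar's method, the function name `sir_with_ci`, and the example counts are all assumptions made for illustration, not values or code from the study.

```python
import math


def sir_with_ci(observed: int, expected: float, z: float = 1.96):
    """Return (SIR, lower, upper) for observed vs. expected case counts.

    The interval uses Byar's approximation to the Poisson confidence
    limits; this method is an assumption, not taken from the study.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed < 0:
        raise ValueError("observed must be non-negative")

    sir = observed / expected

    # Lower limit: defined as 0 when no cases were observed.
    if observed == 0:
        lower = 0.0
    else:
        term = 1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        lower = observed * term ** 3 / expected

    # Upper limit: uses observed + 1 in place of observed.
    o1 = observed + 1
    term = 1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))
    upper = o1 * term ** 3 / expected

    return sir, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    sir, lower, upper = sir_with_ci(observed=50, expected=25.0)
    print(f"SIR = {sir:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An SIR above 1 whose interval excludes 1 would suggest more cases than expected in the reference population; in practice the expected count comes from age-, sex-, and period-specific reference rates, which this sketch takes as a given input.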