He Ru, Zuo Shuguang
Center for Translational Medicine, Huaihe Hospital of Henan University, Kaifeng, China.
Institute of Infection and Immunity, Huaihe Hospital of Henan University, Kaifeng, China.
Front Oncol. 2019 Jul 31;9:693. doi: 10.3389/fonc.2019.00693. eCollection 2019.
The current staging system is imprecise for predicting prognosis in early-stage non-small cell lung cancer (NSCLC). This study aimed to develop a robust prognostic signature for early-stage NSCLC that identifies patients at high risk of poor outcome and supports specific treatment decisions. A comprehensive genome-wide profiling analysis was conducted on a retrospective pool of early-stage NSCLC patient data from three Gene Expression Omnibus (GEO) datasets (GSE31210, GSE37745, and GSE50081) and The Cancer Genome Atlas (TCGA). Cox proportional hazards models were used to determine the association between gene expression levels and overall survival in each dataset. Genes common to all datasets were selected as candidate prognostic genes. A risk score model was developed and validated using the four independent datasets and the entire cohort. Kaplan-Meier analysis with the log-rank test was used to assess survival differences. Univariate Cox proportional hazards regression in each dataset identified 2280 genes in GSE31210, 762 in GSE37745, 871 in GSE50081, and 666 in TCGA as candidate protective genes, and 2131 genes in GSE31210, 913 in GSE37745, 1107 in GSE50081, and 997 in TCGA as candidate risk genes. Eight genes were associated with overall survival in all datasets: 7 mRNAs and 1 lncRNA. Stepwise multivariate Cox analysis was then used to develop an 8-gene prognostic signature (CDCP1, HMMR, TPX2, CIRBP, HLF, KBTBD7, SEC24B-AS1, and SH2B1) for early-stage NSCLC. Patients in the high-risk group had shorter overall survival than those in the low-risk group. Multivariate regression and stratified analysis suggested that the prognostic power of the 8-gene signature was independent of other clinical factors.
Furthermore, the 8-gene signature achieved AUC values of 0.726, 0.701, 0.725, and 0.650 in GSE31210, GSE37745, GSE50081, and TCGA, respectively. Moreover, combining the 8-gene signature with stage resulted in better patient classification for survival prediction and treatment decisions. This study developed a robust gene signature of prognostic value in early-stage NSCLC, which may contribute to patient classification and personalized treatment decisions.
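The candidate-gene selection and risk scoring described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives neither the fitted Cox coefficients nor the cutoff used to define the high-risk and low-risk groups, so the coefficients, the gene and patient identifiers, the expression values, and the median cutoff below are all hypothetical. The score is assumed to be the usual linear combination of expression values weighted by Cox coefficients.

```python
from statistics import median


def common_genes(per_dataset):
    """Return genes that are protective (or risk) in every dataset.

    `per_dataset` maps a dataset name to {"protective": set, "risk": set},
    e.g. the genes flagged by univariate Cox regression in that dataset.
    """
    protective = set.intersection(*(d["protective"] for d in per_dataset.values()))
    risk = set.intersection(*(d["risk"] for d in per_dataset.values()))
    return protective, risk


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk (median cutoff is an assumption)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical toy input: placeholder genes, not results from the study.
candidates = {
    "DATASET_1": {"protective": {"GENE_A", "GENE_C"}, "risk": {"GENE_B", "GENE_D"}},
    "DATASET_2": {"protective": {"GENE_A"}, "risk": {"GENE_B", "GENE_C"}},
}
protective, risk = common_genes(candidates)

# Hypothetical Cox coefficients: negative = protective, positive = risk.
coefficients = {"GENE_A": -0.5, "GENE_B": 0.8}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "P2": {"GENE_A": 1.0, "GENE_B": 3.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.5},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(protective, risk, groups)
```

In a real analysis the resulting groups would then be compared with Kaplan-Meier curves and a log-rank test, and the coefficients would come from the fitted multivariate Cox model rather than being set by hand.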