Zhang Lemeng, Chen Jianhua, Yang Hua, Pan Changqie, Li Haitao, Luo Yongzhong, Cheng Tianli
Thoracic Medicine Department 1, Hunan Cancer Hospital, Changsha, Hunan Province, P.R. China, 410013.
J Cancer. 2021 Jan 1;12(4):996-1010. doi: 10.7150/jca.51264. eCollection 2021.
Chronic obstructive pulmonary disease (COPD) is an independent risk factor for non-small cell lung cancer (NSCLC). This study aimed to identify the key genes and potential molecular mechanisms involved in the progression from COPD to NSCLC. Expression profiles of COPD and NSCLC in GSE106899, GSE12472, and GSE12428 were downloaded from the Gene Expression Omnibus (GEO) database, and the differentially expressed genes (DEGs) between COPD and NSCLC were identified. Based on these DEGs, functional pathway enrichment and lung carcinogenesis-related network analyses were performed and visualized with Cytoscape software. Principal component analysis (PCA), cluster analysis, and support vector machines (SVM) were then used to verify the ability of the top modular genes to distinguish COPD from NSCLC. Additionally, the correlations between these key genes and the clinical staging of NSCLC were studied using the UALCAN and HPA websites. Finally, a prognostic risk model was constructed based on multivariate Cox regression analysis, and Kaplan-Meier survival curves of the top modular genes were generated on the training and verification sets. A total of 2350, 1914, and 1850 DEGs were obtained from the GSE106899, GSE12472, and GSE12428 datasets, respectively. Following analysis of protein-protein interaction networks, modular gene signatures containing H2AFX, MCM2, MCM3, MCM7, POLD1, and RPA1 were identified as markers for discriminating between COPD and NSCLC. These modular gene signatures were mainly enriched in DNA replication, cell cycle, mismatch repair, and other processes. The expression levels of these genes were significantly higher in NSCLC than in COPD, which was further verified by immunohistochemistry. In addition, high expression levels of H2AFX, MCM2, MCM7, and POLD1 correlated with poor prognosis in lung adenocarcinoma (LUAD). The Cox regression prognostic risk model showed similar results, and the predictive ability of this model was independent of other clinical variables. This study revealed several key modules closely related to NSCLC with underlying COPD, which provides a deeper understanding of the potential mechanisms underlying the malignant progression from COPD to NSCLC. It also provides valuable prognostic factors for high-risk lung cancer patients with COPD.