Department of Oncology, The Affiliated Suzhou Municipal Hospital of Nanjing Medical University, Suzhou, Jiangsu 215001, P.R. China.
Int J Mol Med. 2020 Jul;46(1):252-264. doi: 10.3892/ijmm.2020.4590. Epub 2020 Apr 27.
Lung squamous cell carcinoma (LSCC) is one of the primary types of non‑small cell lung carcinoma, and patients with recurrent LSCC usually have a poor prognosis. The present study aimed to build a risk score (RS) system for LSCC. Methylation data on LSCC (training set) and on head and neck squamous cell carcinoma (validation set 2) were obtained from The Cancer Genome Atlas database, and GSE39279 (validation set 1) was retrieved from the Gene Expression Omnibus database. Differentially methylated protein‑coding genes (DMGs) and long non‑coding RNAs (DM‑lncRNAs) between recurrence‑associated and non‑recurrence samples were screened using the limma package, and their correlation was analyzed using the cor.test() function. After the optimal combinations of DMGs or DM‑lncRNAs were identified using the penalized package in R, RS systems were built, and the system with the best performance was selected. A nomogram survival model was then constructed using the rms package. For the differentially expressed genes (DEGs) between the high‑ and low‑risk groups, pathway enrichment analysis was performed by Gene Set Enrichment Analysis. In total, 335 DMGs and DM‑lncRNAs were identified. After the top 10 genes (aldehyde dehydrogenase 7 family member A1; chromosome 8 open reading frame 48; cytokine‑like 1; heat shock protein 90 alpha family class A member 1; isovaleryl‑CoA dehydrogenase; phosphodiesterase 3A; PNMA family member 2; SAM domain, SH3 domain and nuclear localization signals 1; thyroid hormone receptor interactor 13; and zinc finger protein 878) and the top 6 lncRNAs were selected, RS systems were constructed. According to Kaplan‑Meier analysis, the DNA methylation level‑based RS system exhibited the best performance. In combination with independent clinical prognostic factors, a nomogram survival model was built and successfully predicted patient survival. Furthermore, 820 DEGs were identified between the high‑ and low‑risk groups, and 3 pathways were enriched in this gene set. The 10‑DMG methylation level‑based RS system and the nomogram survival model may be applied for predicting the outcomes of patients with LSCC.
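The abstract does not state how the RS is calculated. A common construction, assumed here for illustration only, is a weighted sum of each selected gene's methylation level multiplied by its regression coefficient, followed by a split at the median score into high- and low-risk groups. The minimal Python sketch below shows that construction for three of the ten genes; the gene symbols correspond to the names listed above, while the coefficients, sample names and methylation values are hypothetical placeholders, not results from the study.

```python
from statistics import median

# Hypothetical coefficients (illustrative only, NOT the study's fitted values).
coefs = {"ALDH7A1": 0.5, "CYTL1": -0.25, "TRIP13": 1.0}

# Hypothetical methylation levels per sample (illustrative only).
samples = {
    "s1": {"ALDH7A1": 0.2, "CYTL1": 0.8, "TRIP13": 0.1},
    "s2": {"ALDH7A1": 0.6, "CYTL1": 0.4, "TRIP13": 0.5},
    "s3": {"ALDH7A1": 0.4, "CYTL1": 0.4, "TRIP13": 0.9},
    "s4": {"ALDH7A1": 0.8, "CYTL1": 0.0, "TRIP13": 0.25},
}


def risk_score(coefs, levels):
    """Weighted sum of methylation levels: RS = sum(coef_g * level_g)."""
    return sum(coefs[gene] * levels[gene] for gene in coefs)


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {name: ("high" if score > cutoff else "low")
            for name, score in scores.items()}


scores = {name: risk_score(coefs, levels) for name, levels in samples.items()}
groups = split_by_median(scores)
```

Under this assumed construction, the resulting groups would then be compared with Kaplan‑Meier curves, as the abstract describes for the RS systems it evaluated.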