Department of Physiology, CHA University School of Medicine, Gyeonggi, Republic of Korea.
Catholic University School of Medicine, Seocho-gu, Seoul, Republic of Korea.
PLoS One. 2018 Nov 12;13(11):e0207204. doi: 10.1371/journal.pone.0207204. eCollection 2018.
Lung cancer is the second most common cancer in the United States and the leading cause of mortality in cancer patients. Biomarkers that predict the survival of patients with lung cancer have a profound effect on patient prognosis and treatment. However, predictive biomarkers for survival and their relevance to lung cancer are not yet well known. The objective of this study was to apply machine learning to data from The Cancer Genome Atlas on patients with lung adenocarcinoma (LUAD) to find survival-specific gene mutations that could be used as survival-predicting biomarkers. To identify survival-specific mutations according to various clinical factors, four feature selection methods (information gain, chi-squared test, minimum redundancy maximum relevance, and correlation) were used. The extracted survival-specific mutations of LUAD were applied individually or as a group in Kaplan-Meier survival analysis. Mutations in MMRN2 and GMPPA were significantly associated with patient mortality, while those in ZNF560 and SETX were associated with patient survival. Mutations in DNAJC2 and MMRN2 showed a significant negative association with overall survival, while mutations in ZNF560 showed a significant positive association with overall survival. Mutations in MMRN2 showed a significant negative association with disease-free survival, while mutations in DRD3 and ZNF560 showed a positive association with disease-free survival. Overall, mutations in DRD3, SETX, and ZNF560 showed a significant positive association with survival in patients with LUAD, while the opposite was true for mutations in DNAJC2, GMPPA, and MMRN2. These gene mutations were also found in other cohorts of LUAD, lung squamous cell carcinoma, and small cell lung cancer. In the LUAD patients of the Pan-Lung Cancer cohort, mutations in GMPPA, DNAJC2, and MMRN2 showed significant negative associations with patient survival, while mutations in DRD3 and SETX showed a significant positive association with survival.
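As a rough illustration of two of the feature selection scores named above, the following minimal sketch computes information gain and the Pearson chi-squared statistic for a binary mutation indicator against a binary survival status. The data and the names `gene_a`, `gene_b`, and `status` are hypothetical toy values, not taken from the study, and the study's actual pipeline and software are not described here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def chi_squared(feature, labels):
    """Pearson chi-squared statistic for the feature-by-label table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            expected = f_count * l_count / n
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: status 1 = deceased, 0 = alive;
# gene value 1 = mutated, 0 = wild type.
status = [1, 1, 1, 1, 0, 0, 0, 0]
gene_a = [1, 1, 1, 0, 0, 0, 0, 0]  # mutation mostly tracks status
gene_b = [1, 0, 1, 0, 1, 0, 1, 0]  # mutation independent of status

scores = {
    name: (information_gain(values, status), chi_squared(values, status))
    for name, values in [("gene_a", gene_a), ("gene_b", gene_b)]
}
for name, (ig, chi2) in scores.items():
    print(f"{name}: information gain = {ig:.3f}, chi-squared = {chi2:.3f}")
```

A gene whose mutation status is independent of the outcome scores zero on both measures, so ranking genes by either score pushes such genes to the bottom of the candidate list.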
In this study, machine learning was conducted to obtain information necessary to discover specific gene mutations associated with the survival of patients with LUAD. Mutations in the above six genes could predict survival rate and disease-free survival rate in patients with LUAD. Thus, they are important biomarker candidates for prognosis.
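To make the Kaplan-Meier step concrete, here is a minimal sketch of the product-limit estimator applied separately to a mutated and a wild-type group. The follow-up times and event flags are hypothetical toy values, and the function name `kaplan_meier` is introduced only for illustration. A real analysis would normally use an established survival library and a significance test such as the log-rank test, which this sketch omits.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up durations (e.g. months)
    events: 1 = death observed at that time, 0 = censored
    Returns a list of (time, survival_probability) at each death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy cohorts, not data from the study.
mutated_curve = kaplan_meier(times=[2, 4, 6, 8], events=[1, 1, 0, 1])
wild_type_curve = kaplan_meier(times=[5, 10, 12, 15], events=[0, 1, 0, 0])

for label, curve in [("mutated", mutated_curve), ("wild type", wild_type_curve)]:
    steps = ", ".join(f"t={t}: S={s:.2f}" for t, s in curve)
    print(f"{label}: {steps}")
```

Censored patients leave the at-risk count without lowering the survival estimate, which is why the curve only steps down at observed death times.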