Tian Suyan
Division of Clinical Research, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China.
Center for Applied Statistical Research, School of Mathematics, Jilin University, Changchun, Jilin 130012, P.R. China.
Oncol Lett. 2017 Nov;14(5):5464-5470. doi: 10.3892/ol.2017.6835. Epub 2017 Aug 28.
Non-small cell lung cancer (NSCLC) is a leading cause of cancer-associated mortality worldwide. Adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two primary histological subtypes of NSCLC, together accounting for ~70% of lung cancer cases. Increasing evidence suggests that AC and SCC differ in their gene composition and molecular characteristics. Previous research has focused on distinguishing AC from SCC, or on predicting the survival of patients with NSCLC, using gene expression profiles, usually with the aid of a feature selection method. The present study first applied a pre-filtering step to identify genes that have significant expression values and are highly connected to other genes in the gene network, and then used the radial coordinate visualization method to identify relevant genes. Applying the proposed procedure to NSCLC data demonstrated a clear separation between AC and SCC, but not between patients with a good prognosis and those with a poor prognosis. Discriminating AC from SCC is a different task from survival prediction, and the two gene signatures have almost no overlap. Overall, a supervised learning method is preferred, and future studies aiming to identify prognostic gene signatures with improved prediction efficiency are required.
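As an illustration of the radial coordinate visualization step mentioned in the abstract, the following is a minimal sketch of the textbook RadViz projection, not the study's actual implementation. It assumes each gene is min-max scaled to [0, 1], that one anchor per gene is placed evenly on the unit circle, and that each sample is drawn at the weighted average of the anchors. The function name `radviz_project` and the toy expression values are hypothetical.

```python
import math

def radviz_project(rows):
    """Project samples (rows of feature values) onto the 2-D RadViz plane."""
    n_features = len(rows[0])
    # Min-max scale each feature (column) to [0, 1].
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    # Place one anchor per feature, evenly spaced on the unit circle.
    anchors = [(math.cos(2 * math.pi * j / n_features),
                math.sin(2 * math.pi * j / n_features))
               for j in range(n_features)]
    points = []
    for row in rows:
        weights = [(v - lo) / sp if sp > 0 else 0.0
                   for v, lo, sp in zip(row, lows, spans)]
        total = sum(weights)
        if total == 0:
            # A sample with all-zero weights sits at the centre of the circle.
            points.append((0.0, 0.0))
            continue
        # Each sample is the weight-normalised average of the anchor positions.
        x = sum(w * a[0] for w, a in zip(weights, anchors)) / total
        y = sum(w * a[1] for w, a in zip(weights, anchors)) / total
        points.append((x, y))
    return points

# Toy expression values (hypothetical): rows are samples, columns are genes.
toy = [
    [9.0, 1.0, 1.0],
    [8.0, 2.0, 1.5],
    [1.0, 9.0, 8.0],
    [2.0, 8.0, 9.0],
]
projected = radviz_project(toy)
```

In such a projection, samples dominated by the same genes are pulled toward the same anchors, so two classes whose expression patterns differ (as reported here for AC and SCC) would be expected to form visibly separate clusters, whereas classes without a distinct pattern would overlap.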