Chen Xiaodong, Duan Qiongyu, Xuan Ying, Sun Yunan, Wu Rong
Department of Medical Oncology, Shengjing Hospital of China Medical University, Shenyang, Liaoning, China.
Medicine (Baltimore). 2017 Apr;96(17):e6736. doi: 10.1097/MD.0000000000006736.
We aimed to identify specific pathways that can be used to predict the stage of lung adenocarcinoma. RNA-Seq expression profile data and clinical data for lung adenocarcinoma (stage I [37], stage II [161], stage III [75], and stage IV [45]) were obtained from the TCGA dataset. The differentially expressed genes were merged, a correlation coefficient matrix between genes was constructed by correlation analysis, and unsupervised clustering was carried out with the hierarchical clustering method. A stage-specific coexpression network was constructed for each stage with Cytoscape software. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed with the KOBAS database and the Fisher exact test. The Euclidean distance algorithm was used to calculate a total deviation score. The diagnostic model was constructed with the support vector machine (SVM) algorithm. Eighteen specific genes were obtained by intersecting the 4 groups of differentially expressed genes. Ten significantly enriched pathways were obtained. In the distribution map of the 10 pathway scores across groups, the degree to which sample groups deviated from the normal level was ordered as follows: stage I < stage II < stage III < stage IV. In some pathways, the pathway score changed linearly across the 4 stages; in others, the scores of 1 or 2 stages differed significantly from those of the remaining stages. For all of these pathways except the thyroid hormone signaling pathway, there was a significant difference between patients who died and those who survived. These 10 pathways are associated with the development of lung adenocarcinoma and may be able to predict its different stages. Furthermore, these pathways, except the thyroid hormone signaling pathway, may be able to predict prognosis.
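The abstract does not specify how the total deviation score is computed. As a rough sketch of one possible reading, the score for a pathway could be the Euclidean distance between a sample's expression vector and the mean expression of normal samples, taken over that pathway's genes. The function names, gene names, and toy values below are hypothetical illustrations, not taken from the paper.

```python
import math

def normal_reference(normal_samples):
    """Per-gene mean across normal samples.

    Each sample is a dict mapping gene name -> expression value.
    (Assumption: the 'normal level' is the per-gene mean of normal tissue.)
    """
    genes = sorted(normal_samples[0])
    n = len(normal_samples)
    return {g: sum(s[g] for s in normal_samples) / n for g in genes}

def deviation_score(sample, reference, pathway_genes):
    """Euclidean distance from the normal reference over one pathway's genes."""
    return math.sqrt(sum((sample[g] - reference[g]) ** 2 for g in pathway_genes))

# Toy data with hypothetical gene names and values.
normals = [
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 3.0, "GENE_B": 2.0},
]
reference = normal_reference(normals)  # GENE_A -> 2.0, GENE_B -> 2.0
tumor = {"GENE_A": 5.0, "GENE_B": 6.0}

score = deviation_score(tumor, reference, ["GENE_A", "GENE_B"])
print(score)  # 5.0, since sqrt(3**2 + 4**2) = 5
```

Under this reading, a larger score means a sample lies farther from the normal level for that pathway, which is the quantity the abstract reports as increasing from stage I to stage IV.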