Xu Wenfang, Liu Zhenhao, Ren He, Peng Xueqing, Wu Aoshen, Ma Duan, Liu Gang, Liu Lei
Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, P.R. China.
J Cancer. 2020 Jan 1;11(2):441-449. doi: 10.7150/jca.30923. eCollection 2020.
Glioma, which arises from the carcinogenesis of glial cells in the brain and spinal cord, is the most common primary malignant brain tumor. Identifying reliable indicators of glioma prognosis remains a challenge, and metabolic alterations in glioma have been reported frequently in recent years. In this work, we developed a risk score model based on the expression of twenty metabolic genes, using metabolic gene expression data from The Cancer Genome Atlas (TCGA) dataset. The model was built with multivariate Cox regression and random forest variable hunting, a machine learning algorithm, and the resulting risk score was used to predict the survival of glioma patients in the training dataset. The result was then verified in three other validation sets (GSE4271, GSE4412 and GSE16011). Risk score related pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database were identified using Gene Set Enrichment Analysis (GSEA). The risk score predicted the survival of glioma patients well in both the training dataset and the three validation sets. By assessing the relationships between clinical indicators and the risk score, we found that the risk score was an independent and significant indicator of prognosis in glioma patients. We also performed survival analyses separately for patients who received chemotherapy and those who did not, and found that the risk score was equally valid in both groups. In addition, signaling pathways related to the genesis and development of multiple cancers were identified. In summary, our risk score model is predictive of survival for 967 glioma patients from four independent datasets, and the risk score is a meaningful parameter that is independent of clinicopathological information.
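The abstract does not state how the risk score is computed. A common construction for gene-signature models of this kind is a linear combination of expression values weighted by Cox regression coefficients, with patients then split into high- and low-risk groups at the cohort median. The sketch below illustrates that construction only; the linear form, the median cutoff, the gene names, the coefficients and the expression values are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Assumed linear score: sum of coefficient * expression over signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values (not from the paper).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 6.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
```

In a real analysis, the coefficients would come from a fitted Cox model and the two groups would be compared with a survival test; neither step is shown here.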