Pan Yida, Zhang Hongyang, Zhang Mingming, Zhu Jie, Yu Jianghong, Wang Bangting, Qiu Jigang, Zhang Jun
Department of Digestive Diseases, Huashan Hospital, Fudan University, Shanghai 200040, P.R. China.
Department of Gastroenterology, Nanjing Drum Tower Hospital, Nanjing University, Nanjing 210008, P.R. China.
Oncol Lett. 2017 Dec;14(6):6724-6734. doi: 10.3892/ol.2017.7097. Epub 2017 Sep 28.
Colorectal cancer (CRC) is one of the most frequently occurring malignancies worldwide. The outcomes of patients with similar clinical symptoms or at similar pathological stages remain unpredictable. This inherent clinical diversity is most likely due to genetic heterogeneity. The present study aimed to create a prediction tool to evaluate patient survival based on genetic profile. First, three Gene Expression Omnibus (GEO) datasets (GSE9348, GSE44076 and GSE44861) were used to identify and validate differentially expressed genes (DEGs) in CRC. The GSE14333 dataset, which contains survival information, was then introduced to screen and verify prognosis-associated genes. Of the 66 DEGs, the present study identified 46 biomarkers closely associated with patient overall survival. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis demonstrated that these genes participate in multiple biological processes that are highly associated with cancer proliferation, drug resistance and metastasis, thus further affecting patient survival. The five most important genes, namely MET proto-oncogene, receptor tyrosine kinase (MET); carboxypeptidase M (CPM); serine hydroxymethyltransferase 2 (SHMT2); guanylate cyclase activator 2B (GUCA2B); and sodium voltage-gated channel alpha subunit 9 (SCN9A), were selected by a random survival forests algorithm and combined into a linear risk score formula by multivariable Cox regression. Finally, the present study tested and verified this risk score in three independent GEO datasets (GSE14333, GSE17536 and GSE29621), and observed that patients with a high risk score had lower overall survival (P<0.05). Furthermore, this risk score was the most significant factor in the model compared with other predictive factors, including age and American Joint Committee on Cancer stage, and was able to predict patient survival independently and directly.
The findings suggest that this survival-associated, DEG-based risk score is a powerful and accurate prognostic tool that shows promise for implementation in a clinical setting.
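As an illustration of what a linear risk score of this kind looks like, the following minimal Python sketch computes a weighted sum of expression values for the five selected genes and splits a cohort into high- and low-risk groups. It is a sketch under stated assumptions, not the study's formula: the abstract does not report the fitted Cox coefficients or the cutoff, so the coefficient values below are arbitrary placeholders, and the median split and the names `risk_score` and `split_by_median` are hypothetical choices made for this example.

```python
from statistics import median

# Symbols of the five genes named in the abstract.
GENES = ("MET", "CPM", "SHMT2", "GUCA2B", "SCN9A")

# ARBITRARY PLACEHOLDER coefficients (assumption): the abstract does not give
# the values fitted by multivariable Cox regression, nor their signs.
PLACEHOLDER_COEFS = {
    "MET": 0.5,
    "CPM": -0.3,
    "SHMT2": -0.2,
    "GUCA2B": -0.1,
    "SCN9A": 0.4,
}


def risk_score(expression, coefs=PLACEHOLDER_COEFS):
    """Linear risk score: sum over genes of coefficient * expression value."""
    return sum(coefs[gene] * expression[gene] for gene in GENES)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the cohort median.

    The median cutoff is an assumption for illustration only.
    """
    cutoff = median(scores.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }


if __name__ == "__main__":
    # Made-up expression values for three hypothetical patients.
    cohort = {
        "P1": {"MET": 2.0, "CPM": 1.0, "SHMT2": 1.0, "GUCA2B": 1.0, "SCN9A": 1.0},
        "P2": {"MET": 1.0, "CPM": 2.0, "SHMT2": 1.0, "GUCA2B": 1.0, "SCN9A": 1.0},
        "P3": {"MET": 1.0, "CPM": 1.0, "SHMT2": 1.0, "GUCA2B": 1.0, "SCN9A": 3.0},
    }
    scores = {patient: risk_score(expr) for patient, expr in cohort.items()}
    print(split_by_median(scores))
```

In practice the resulting groups would be compared with a survival analysis (for example, a log-rank test on overall survival), which this sketch leaves out.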