Department of Colorectal Surgery, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200092, P.R. China.
Int J Mol Med. 2018 Mar;41(3):1419-1426. doi: 10.3892/ijmm.2018.3359. Epub 2018 Jan 2.
Colorectal cancer (CRC) is one of the most common cancers and a major cause of mortality. The present study aimed to identify potential biomarkers for CRC metastasis and uncover the mechanisms underlying the etiology of the disease. The five datasets GSE68468, GSE62321, GSE22834, GSE14297 and GSE6988 were utilized in the study, all of which contained metastatic and non-metastatic CRC samples. Among them, three datasets were integrated via meta-analysis to identify the differentially expressed genes (DEGs) between the two types of samples. A protein-protein interaction (PPI) network was constructed for these DEGs. Candidate genes were then selected by the support vector machine (SVM) classifier based on the betweenness centrality (BC) algorithm. A CRC dataset from The Cancer Genome Atlas database was used to evaluate the accuracy of the SVM classifier. Pathway enrichment analysis was carried out for the SVM-classified gene signatures. In total, 358 DEGs were identified by meta-analysis. The top ten nodes in the PPI network with the highest BC values were selected, including cAMP responsive element binding protein 1 (CREB1), cullin 7 (CUL7) and signal sequence receptor 3 (SSR3). The optimal SVM classification model was established, which was able to precisely distinguish between the metastatic and non-metastatic samples. Based on this SVM classifier, 40 signature genes were identified, which were mainly enriched in protein processing in endoplasmic reticulum (e.g., SSR3), AMPK signaling pathway (e.g., CREB1) and ubiquitin mediated proteolysis (e.g., FBXO2, CUL7 and UBE2D3) pathways. In conclusion, the SVM-classified genes, including CREB1, CUL7 and SSR3, precisely distinguished the metastatic CRC samples from the non-metastatic ones. These genes have the potential to be used as biomarkers for the prognosis of metastatic CRC.
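The betweenness centrality (BC) ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it is a plain-Python version of Brandes' algorithm for an unweighted, undirected graph, and the names `betweenness_centrality`, `build_adjacency`, `top_nodes` and `TOY_EDGES` are hypothetical. The gene symbols come from the abstract, but the edges between them are invented purely for illustration and do not represent real protein-protein interactions. The SVM classification step is not covered here.

```python
from collections import deque


def build_adjacency(edges):
    """Turn a list of undirected edges into a node -> neighbour-set mapping."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    Returns unnormalised scores in which each unordered node pair
    is counted once.
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first visit
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inwards.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Every undirected pair was visited from both endpoints, so halve the totals.
    return {node: value / 2.0 for node, value in scores.items()}


# ASSUMPTION: invented edges for illustration only, not real PPI data.
TOY_EDGES = [
    ("CREB1", "CUL7"), ("CREB1", "SSR3"), ("CREB1", "UBE2D3"),
    ("CUL7", "FBXO2"), ("CUL7", "UBE2D3"),
]


def top_nodes(edges, k):
    """Return the k nodes with the highest BC, ties broken alphabetically."""
    scores = betweenness_centrality(build_adjacency(edges))
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]
```

In this invented graph, `top_nodes(TOY_EDGES, 2)` ranks CREB1 and CUL7 first because every shortest path to the peripheral nodes passes through one of them. In the study, the analogous ranking was applied to the PPI network of the 358 DEGs, and the top ten nodes were retained.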