Wang X J, Shen R F, Wang X, Wang Y R, Xiao T
State Key Laboratory of Molecular Oncology, Beijing Key Laboratory for Carcinogenesis and Cancer Prevention, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China.
Zhonghua Zhong Liu Za Zhi. 2020 May 23;42(5):396-402. doi: 10.3760/cma.j.cn112152-112152-20191115-00740.
Objective: To investigate the differential gene expression profiles of hepatocellular carcinoma (HCC) with high and low alpha-fetoprotein (AFP) expression, and to provide a theoretical basis for the molecular mechanism and prognostic analysis of HCC.
Methods: Transcriptome data and related clinical information for 368 HCC cases were obtained from The Cancer Genome Atlas (TCGA) public database. Samples were divided by quartile of AFP mRNA expression into an AFP high-expression (AFP(high)) group and an AFP low-expression (AFP(low)) group, with 92 cases in each group (one quarter of the 368 cases). Differential expression analysis was carried out with the DESeq2 package in R. Functional and KEGG pathway enrichment analysis of the differentially expressed genes was performed with the clusterProfiler package. A protein-protein interaction network was constructed with the STRING database and Cytoscape software to screen for hub genes. Single-sample gene set enrichment analysis (ssGSEA) was performed with the GSVA package to score signature gene sets. RNA-seq data from an independent dataset and real-time quantitative polymerase chain reaction (RT-qPCR) on tissue samples were then used for validation.
Results: Clinical analysis showed that high AFP expression was significantly associated with poor pathological differentiation and with ethnicity (P<0.05 for both). Bioinformatics analysis identified a total of 1,382 differentially expressed genes, of which 931 were up-regulated and 451 were down-regulated in the AFP(high) group. GO enrichment analysis showed that the up-regulated genes were mainly associated with appendage development, limb development, and skeletal system development, while the down-regulated genes were associated with metabolic processes such as xenobiotic metabolism, steroid metabolism, and cellular response to xenobiotic stimulus. KEGG pathway enrichment analysis showed that the up-regulated genes were mainly involved in primary immunodeficiency, neuroactive ligand-receptor interaction, and cytokine-cytokine receptor interaction, while the down-regulated genes were mainly involved in retinol metabolism, chemical carcinogenesis, steroid hormone biosynthesis, and other pathways. A prognosis-related gene set consisting of AURKB, TTK, CENPA, UBE2C, HJURP, and KIF15 was identified. High expression of this gene set was associated with shorter recurrence-free survival and overall survival in HCC patients, and its enrichment score was positively correlated with AFP expression (r=0.475, P<0.001). The validation results from the independent RNA-seq data were broadly consistent with the TCGA data. RT-qPCR showed that AURKB, KIF15, and UBE2C were significantly overexpressed in HCC tissues with high AFP expression. Although the expression of AURKB, TTK, KIF15, and UBE2C was not associated with recurrence-free survival or overall survival of HCC patients, patients with high AFP levels tended to have shorter recurrence-free survival and overall survival.
Conclusions: Gene expression profiles differ substantially between AFP(high) and AFP(low) HCC. The prognostic signature may cooperate with AFP to promote the initiation and development of HCC. It may also help explain tumorigenesis in HCC with different AFP levels and provide new clues for the prognosis of HCC.
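The quartile grouping described in the methods (368 cases, 92 per group) can be illustrated with a minimal Python sketch. It assumes the groups are the bottom and top quarters of samples ranked by AFP mRNA expression; the abstract does not state the exact cutoffs or tie handling. The function name `split_by_quartile`, the sample identifiers, and the expression values are hypothetical and synthetic, not study data, and the study itself was carried out in R.

```python
import random


def split_by_quartile(expression):
    """Return (low_ids, high_ids): the bottom and top quarter of samples by value.

    `expression` maps a sample ID to an expression value. This is a rank-based
    split and an assumption about the grouping, not the authors' exact procedure.
    """
    ranked = sorted(expression, key=expression.get)  # sample IDs, ascending by value
    k = len(ranked) // 4                             # size of one quartile group
    if k == 0:
        raise ValueError("need at least 4 samples to form quartile groups")
    return ranked[:k], ranked[-k:]


# Synthetic stand-in for AFP mRNA expression in 368 cases (values are made up).
rng = random.Random(0)
afp = {f"case_{i:03d}": rng.lognormvariate(0.0, 2.0) for i in range(368)}

afp_low, afp_high = split_by_quartile(afp)
print(len(afp_low), len(afp_high))  # 92 92
```

With 368 samples each group holds 368 // 4 = 92 cases, matching the group sizes reported above; the middle 184 cases are left out of the high-versus-low comparison under this assumption.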