Second Department of Surgery, Naval Hospital of Eastern Theater, Zhoushan 316000, Zhejiang, China.
Comb Chem High Throughput Screen. 2022;25(6):998-1004. doi: 10.2174/1386207324666210309100923.
This study aimed to construct a prognostic model from genetic markers of liver cancer and to explore signature genes associated with the tumor immune microenvironment.
Cox proportional hazards regression analysis was carried out on the TCGA Liver Cancer (LIHC) gene expression dataset to screen for genes with a significant hazard ratio (HR). LASSO (least absolute shrinkage and selection operator) was then applied to select a minimal set of variables from the genes with significant HR, and the prognostic model was constructed from these variables and their HR values. Time-dependent receiver operating characteristic (ROC) curves and area under the ROC curve (AUC) values were used to assess prognostic performance. Patients were divided into high- and low-risk groups at the median of the model score. Survival analysis was performed on the two groups in the testing dataset and in an independent dataset. Furthermore, enrichment analysis was performed on the signature mRNAs and lncRNAs and their co-expressed genes. Finally, Spearman rank correlation was used to calculate the correlation between immune cells and the genes in the prognostic model, and differences in immune cell abundance between the high- and low-risk groups were tested.
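As a rough illustration of the scoring and median split described above, the following minimal Python sketch computes a per-patient risk score and assigns risk groups. It rests on assumptions not stated in the abstract: the score is taken to be a weighted sum of expression values with log(HR) as the weight, and the gene names, hazard ratios, and expression values are hypothetical placeholders, not data from the study.

```python
import math
from statistics import median

# Hypothetical hazard ratios for illustration only (not values from the study).
HAZARD_RATIOS = {"GENE_A": 1.8, "GENE_B": 0.6, "GENE_C": 1.3}


def risk_score(expression, hazard_ratios):
    """Weighted sum of expression values.

    Assumption: each gene's weight is log(HR), i.e. its Cox coefficient.
    """
    return sum(math.log(hr) * expression[gene] for gene, hr in hazard_ratios.items())


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression profiles for four patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5},
    "P2": {"GENE_A": 0.5, "GENE_B": 3.0, "GENE_C": 0.2},
    "P3": {"GENE_A": 3.0, "GENE_B": 0.5, "GENE_C": 1.0},
    "P4": {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 1.0},
}

scores = {pid: risk_score(expr, HAZARD_RATIOS) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

In practice the cutoff would typically be fixed on the discovery cohort and reused unchanged on the testing and independent cohorts; whether the study did so is not stated here.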
A total of 5989 genes with significant HR were identified. Six key genes screened by LASSO (three mRNAs: DHX37, SMIM7, and MFSD1; three lncRNAs: PIWIL4, KCNE5, and LOC100128398) were used to construct the model, each with its HR value. The AUC values for 1- and 5-year overall survival were 0.78 and 0.76 in the discovery data and 0.67 and 0.68 in the testing data. Survival analysis significantly discriminated between the high- and low-risk groups in both the testing and independent data. Furthermore, many immune cells, such as nTreg, were significantly correlated with the genes in the prognostic model, and many immune cells differed significantly in abundance between the high- and low-risk groups.
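The gene-to-immune-cell correlations reported above rely on Spearman rank correlation. A minimal, dependency-free Python sketch of that statistic follows (average ranks for ties, then Pearson correlation on the ranks). The input vectors are hypothetical placeholders, not measurements from the study, and the sketch does not compute a significance test.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical expression of one model gene and abundance of one immune cell type
# across five samples.
gene_expression = [1.2, 3.4, 2.2, 5.1, 4.0]
cell_abundance = [0.10, 0.30, 0.25, 0.35, 0.50]
rho = spearman(gene_expression, cell_abundance)
```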
In this study, we used univariate Cox analysis and the LASSO algorithm on TCGA gene expression data to construct a prognostic model for liver cancer patients. The model comprised three mRNAs (DHX37, SMIM7, and MFSD1) and three lncRNAs (PIWIL4, KCNE5, and LOC100128398). The expression levels of these genes were associated with the abundance of some immune cells, such as nTreg, and many immune cells differed significantly in abundance between the high- and low-risk groups. These results indicate that the combination of these six genes could be a potential biomarker for the prognosis of liver cancer.