Lyu Feng, Gao Xianshu, Ma Mingwei, Xie Mu, Shang Shiyu, Ren Xueying, Liu Mingzhu, Chen Jiayan
Department of Radiation Oncology, Peking University First Hospital, Beijing 100034, China.
First Clinical Medical School, Hebei North University, Zhangjiakou 075000, China.
Diagnostics (Basel). 2023 Jun 7;13(12):1997. doi: 10.3390/diagnostics13121997.
Prostate cancer (PCa) is a significant clinical problem, particularly in patients with high Gleason score (GS) disease. This study aimed to develop and validate a risk model based on the profiles of high-GS PCa patients for early identification and prognosis prediction.
We performed differential gene expression analysis on patient samples from The Cancer Genome Atlas (TCGA), followed by functional enrichment analysis to characterize the genes identified. Using least absolute shrinkage and selection operator (LASSO) regression, we established a risk model and validated it in an independent dataset from the International Cancer Genome Consortium (ICGC). Clinical variables were incorporated into a nomogram to predict overall survival (OS), and machine learning was used to explore how the risk factors influence PCa prognosis. The prognostic model was further confirmed using several resources, including single-cell RNA-sequencing (scRNA-seq) datasets, the Cancer Cell Line Encyclopedia (CCLE), PCa cell lines, and tumor tissues.
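A LASSO-derived risk model of this kind is usually applied as a linear score: each signature gene's expression is multiplied by its fitted coefficient and the products are summed. The sketch below illustrates that step only. The gene names, coefficients, and expression values are placeholders, since the abstract does not report fitted coefficients, and the median cutoff for splitting patients into high- and low-risk groups is an assumption rather than the authors' stated method.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(scores):
    """Label each patient 'high' or 'low' relative to the cohort median score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Placeholder coefficients and expression values, for illustration only.
# They are NOT taken from the study.
coefficients = {"gene_a": 0.5, "gene_b": -0.25}
cohort = [
    {"gene_a": 2.0, "gene_b": 4.0},
    {"gene_a": 4.0, "gene_b": 0.0},
    {"gene_a": 1.0, "gene_b": 2.0},
    {"gene_a": 6.0, "gene_b": 4.0},
]

scores = [risk_score(patient, coefficients) for patient in cohort]
groups = stratify(scores)
print(list(zip(scores, groups)))
```

In practice the coefficients would come from a penalized Cox fit on the training cohort, and the same coefficients and cutoff would be reused unchanged on the validation cohort.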
We identified 83 differentially expressed genes (DEGs). Of these, WASIR1, KRTAP5-1, TLX1, KIF4A, and IQGAP3 were significant risk factors for OS and progression-free survival (PFS). Based on these five risk factors, we developed a risk model and nomogram for predicting OS and PFS, with a C-index of 0.823 (95% CI, 0.766-0.881) and a 10-year area under the curve (AUC) of 0.788 (95% CI, 0.633-0.943). In validation with the ICGC dataset, the 3-year AUC was 0.759. KRTAP5-1 and WASIR1 were the most influential prognostic factors in the optimized machine learning model. Finally, the model was associated with immune cell infiltration, and the signature genes were differentially expressed in PCa cells in both scRNA-seq datasets and tissue samples.
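For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index, the most common definition. The abstract does not state which estimator the authors used, so this is an assumption, and the follow-up times, event flags, and scores are toy values rather than study data. A pair of patients is comparable when the one with the shorter follow-up had an observed event, and it is concordant when that patient also has the higher risk score.

```python
def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the higher-risk patient fails first.

    Pairs with tied follow-up times are skipped, a simplification of the full estimator.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable: patient i had an observed event strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up time, event indicator (1 = event, 0 = censored), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.3, 0.5, 0.1]

c_index = concordance_index(times, events, scores)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.823 should be read.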
We developed a novel prognostic model based on a five-gene signature using TCGA data and machine learning, providing new insights into risk stratification and survival prediction for PCa patients in clinical practice.