Zhang Fangyuan, Yu Shicheng, Wu Pengjie, Liu Liansheng, Wei Dong, Li Shengwen
School of Clinical Medicine, Tsinghua University, Beijing, China.
Key Laboratory of Regenerative Biology of the Chinese Academy of Sciences and Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China.
Transl Androl Urol. 2021 Sep;10(9):3540-3554. doi: 10.21037/tau-21-581.
Clear cell renal cell carcinoma (ccRCC) is the most common malignant kidney tumor in adults. Single-cell transcriptome sequencing can provide accurate gene expression data for individual cells. Integrating single-cell and bulk transcriptome data from ccRCC samples gives a more comprehensive picture, which can yield new insights into ccRCC and support the construction of a novel prognostic model for ccRCC patients.
Single-cell transcriptome sequencing data were preprocessed with the Seurat package in R. Principal component analysis (PCA) and the t-distributed stochastic neighbor embedding (t-SNE) algorithm were used for cluster classification. Two subtypes of cancer cells were identified, and pseudotime trajectory analysis and gene ontology (GO) analysis were conducted with the monocle and clusterProfiler packages. Two novel cancer cell biomarkers were identified from the single-cell sequencing data and confirmed with The Cancer Genome Atlas (TCGA) data. T cell-related marker genes derived from single-cell sequencing were screened by a combination of Kaplan-Meier (KM) analysis, univariate Cox analysis, least absolute shrinkage and selection operator (Lasso) regression, and multivariate Cox analysis of TCGA data. Four survival-predicting genes were selected to develop a risk score model. A nomogram combining the risk score and clinical information was constructed to predict the prognosis of ccRCC patients.
A total of 5,933 cells were included in the study after quality control. Fifteen cell clusters were classified by PCA and the t-SNE algorithm. Two clusters of cancer cells with distinct differentiation status were identified, and GO analysis revealed that the enriched biological processes differed between the two subgroups. Egl-9 family hypoxia-inducible factor 3 (EGLN3) and nucleolar protein 3 (NOL3) were specifically expressed in cancer cell clusters, and bulk RNA sequencing data from TCGA confirmed their high expression in ccRCC tissues. GTSE1, CENPF, SMC2 and H2AFV were selected and used to construct the risk score model. A nomogram was generated to predict the prognosis of ccRCC by combining the risk score and clinical parameters.
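To illustrate how a four-gene risk score of this kind is typically computed, here is a minimal Python sketch. It assumes the common form in which the score is a weighted sum of expression values, with weights taken from a multivariate Cox model, and in which patients are split at the cohort median. The abstract does not report the fitted coefficients or the cutoff, so the coefficient values, their signs, the median split, and the function names `risk_score` and `stratify` are all hypothetical placeholders, not the authors' implementation.

```python
from statistics import median

# Hypothetical coefficients for illustration only. The abstract does not
# report the fitted Cox coefficients (or their signs) for these genes.
COEFFICIENTS = {"GTSE1": 0.5, "CENPF": 0.25, "SMC2": 0.25, "H2AFV": 0.5}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Weighted sum of gene expression values (a linear predictor)."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Score every patient and split the cohort at the median score.

    The median cutoff is an assumption; other cutoffs are possible.
    Patients whose score equals the cutoff are assigned to the low group.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


if __name__ == "__main__":
    # Made-up expression values, not data from the study.
    demo = {
        "p1": {"GTSE1": 1.0, "CENPF": 1.0, "SMC2": 1.0, "H2AFV": 1.0},
        "p2": {"GTSE1": 2.0, "CENPF": 2.0, "SMC2": 2.0, "H2AFV": 2.0},
        "p3": {"GTSE1": 0.0, "CENPF": 0.0, "SMC2": 0.0, "H2AFV": 0.0},
    }
    scores, cutoff, groups = stratify(demo)
    print(scores, cutoff, groups)
```

In a nomogram, a score like this would enter as one predictor alongside clinical parameters; that step is not shown here.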
We integrated single-cell and bulk transcriptome data from ccRCC in this study. Two subtypes of ccRCC cells with different biological characteristics and two potential biomarkers of ccRCC were discovered. A novel prognostic model was constructed for clinical application.