Xu Yong, Wang Yao, Liang Leilei, Song Nan
Department of Thoracic Surgery, Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai, China.
Front Genet. 2022 Aug 15;13:975542. doi: 10.3389/fgene.2022.975542. eCollection 2022.
Single-cell RNA sequencing is necessary to understand tumor heterogeneity, but the cell type heterogeneity of lung adenocarcinoma (LUAD) has not been fully studied. We first reduced the dimensionality of the GSE149655 single-cell data. We then statistically analysed the subpopulations obtained by cell annotation to find those highly enriched in tumor tissues. Monocle was used to predict the developmental trajectory of five subpopulations; BEAM was used to find the regulatory genes of five branches; q-values were used to screen the key genes; and CellChat was used to analyse cell communication. Next, we screened for genes overlapping with the differentially expressed genes of TCGA-LUAD and established a prognostic risk model through univariate and multivariate analyses. To assess the independence of the model in clinical application, univariate and multivariate Cox regression were used to estimate the hazard ratio (HR), its 95% confidence interval (CI) and the P value. Finally, the novel biomarker genes were verified by qPCR and immunohistochemistry. The single-cell dataset GSE149655 was subjected to quality control, filtering and dimensionality reduction. In total, 23 subpopulations were identified, and 11 cell subgroups were annotated among them. Through statistical analysis of the 11 subgroups, five important subgroups were selected: lung epithelial cells, macrophages, neuroendocrine cells, secretory cells and T cells. Cell trajectory and cell communication analyses showed that the interactions among the five subpopulations are complex and that communication between them is dense. We believe that these five subpopulations play an important role in the occurrence and development of LUAD. Using TCGA data, we screened the marker genes of these five subpopulations that are also differentially expressed in tumorigenesis, 462 genes in total, and constructed a 10-gene prognostic risk model from them.
The 10-gene signature is robust and achieves stable predictive performance in datasets from different platforms. Two new molecular markers related to LUAD, HLA-DRB5 and CCDC50, were verified by qPCR and immunohistochemistry. The results showed that HLA-DRB5 expression was negatively correlated with the risk of LUAD, whereas CCDC50 expression was positively correlated with the risk of LUAD. We therefore identified a prognostic risk model comprising CCL20, CP, HLA-DRB5, RHOV, CYP4B1, BASP1, ACSL4, GNG7, CCDC50 and SPATS2 as risk biomarkers and verified their predictive value for the prognosis of LUAD; these genes could serve as new therapeutic targets.
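As a rough illustration of how a gene-signature prognostic risk model of this kind is typically applied, the sketch below computes a linear risk score (the sum of coefficient times expression) and splits samples at the median score. This is not the authors' implementation. The coefficient values, the sample data, the function names and the median cutoff are all hypothetical assumptions. Only the signs of the two coefficients follow the abstract (HLA-DRB5 negatively and CCDC50 positively associated with risk).

```python
from statistics import median

# Hypothetical placeholder coefficients, NOT values from the study.
# Only the signs follow the abstract: HLA-DRB5 lowers risk, CCDC50 raises it.
COEFFICIENTS = {"HLA-DRB5": -0.5, "CCDC50": 0.5}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear risk score: sum over genes of coefficient * expression."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(samples, coefficients=COEFFICIENTS):
    """Label each sample 'high' or 'low' risk using a median cutoff (an assumed convention)."""
    scores = {sid: risk_score(expr, coefficients) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if score > cutoff else "low") for sid, score in scores.items()}


# Made-up expression values for three hypothetical patients.
samples = {
    "patient_a": {"HLA-DRB5": 4.0, "CCDC50": 1.0},
    "patient_b": {"HLA-DRB5": 1.0, "CCDC50": 3.0},
    "patient_c": {"HLA-DRB5": 2.0, "CCDC50": 2.0},
}
groups = stratify(samples)
```

In a real analysis the coefficients would come from the fitted multivariate Cox model, and the cutoff would be chosen and validated on independent cohorts.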