Department of Emergency, Nanjing Jiangning Hospital, Nanjing, Jiangsu, China.
Hefei Cancer Hospital, Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Institutes of Physical Science, Chinese Academy of Sciences, Hefei, China.
J Gene Med. 2024 Feb;26(2):e3673. doi: 10.1002/jgm.3673.
Breast cancer (BC), a malignant tumor, is a significant cause of death and disability among women globally. Recent research indicates that copy number variation plays a crucial role in tumor development. In this study, we employed the Single-Cell Variational Aneuploidy Analysis (SCEVAN) algorithm to differentiate between malignant and non-malignant cells, aiming to identify genetic signatures with prognostic relevance for predicting patient survival.
We analyzed gene expression profiles and associated clinical data from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. Using the SCEVAN algorithm, we distinguished malignant from non-malignant cells and investigated cellular interactions within the tumor microenvironment (TME). We categorized TCGA samples based on differentially expressed genes (DEGs) between these cell types. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was then conducted. Additionally, we developed polygenic models for the DEGs using least absolute shrinkage and selection operator (LASSO)-penalized Cox regression analysis. To assess the prognostic accuracy of these signatures, we generated Kaplan-Meier and receiver operating characteristic (ROC) curves from training and validation datasets. We also tracked the expression changes of prognostic genes across the pseudotime of malignant cells. Patients were divided into high-risk and low-risk groups based on the median risk score to compare their TME and identify potential therapeutic agents. Lastly, polymerase chain reaction was used to validate seven pivotal genes.
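The median-based risk stratification described above can be illustrated with a minimal sketch. It assumes the risk score is the usual linear predictor of a penalized Cox model (the sum of coefficient times expression over the signature genes). The gene names, coefficients, and patient values below are hypothetical placeholders, since the abstract does not list the 14 genes or their weights.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the actual signature
# genes and their LASSO-Cox weights are not given in the abstract.
COEFS = {"GENE_A": 0.42, "GENE_B": -0.31, "GENE_C": 0.18}


def risk_score(expression, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(weight * expression[gene] for gene, weight in coefs.items())


def stratify(patients, coefs=COEFS):
    """Score every patient and split the cohort at the median risk score."""
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


# Toy expression values (hypothetical), one dict per patient.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5},
    "P2": {"GENE_A": 0.5, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 3.0, "GENE_B": 0.2, "GENE_C": 2.0},
    "P4": {"GENE_A": 1.0, "GENE_B": 1.5, "GENE_C": 0.1},
}

scores, cutoff, groups = stratify(patients)
for pid in patients:
    print(pid, round(scores[pid], 3), groups[pid])
```

Patients whose score exceeds the cohort median are labeled high-risk here; how ties at the median are handled is a choice of this sketch, not something the abstract specifies.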
The SCEVAN algorithm identified distinct malignant and non-malignant cells in GSE180286. CellChat analysis revealed significantly increased cellular communication, particularly between fibroblasts, endothelial cells and malignant cells. The DEGs were predominantly involved in immune-related pathways. TCGA samples were classified into clusters A and B based on these genes. Cluster A, enriched in immune pathways, was associated with poorer prognosis, whereas cluster B, predominantly involved in circadian rhythm pathways, showed better outcomes. We constructed a 14-gene prognostic signature and validated it in an internal TCGA cohort (1:1 split) and in external GEO datasets (GSE42568 and GSE146558). Kaplan-Meier analysis confirmed the prognostic signature's accuracy (p < 0.001). Receiver operating characteristic curve analysis demonstrated the predictive reliability of these prognostic features. Single-cell pseudotime analysis with Monocle 2 highlighted the distinct expression trends of these genes in malignant cells, underscoring the intratumoral heterogeneity. Furthermore, we explored the differences in TME between high- and low-risk groups and identified 16 significantly correlated drugs.
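For readers unfamiliar with the Kaplan-Meier comparison reported above, the following is a minimal sketch of the product-limit estimator applied to one risk group. The follow-up times and event flags are hypothetical toy values, not data from the study, and the function is an illustration rather than the authors' analysis code.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up durations (any consistent unit)
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical toy data for a single group (times in months).
toy_times = [1, 2, 3, 4]
toy_events = [1, 0, 1, 1]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, s)
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that a log-rank test would then compare; that test is not implemented in this sketch.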
Our findings suggest that the 14-gene prognostic signature could serve as a novel biomarker for forecasting the prognosis of BC patients. Additionally, the immune cells and pathways in different risk groups indicate that immunotherapy may be a crucial component of treatment strategies for BC patients.