Fang Yutong, Zheng Rongji, Xiao Yefeng, Zhang Qunchen, Liu Junpeng, Wu Jundong
Department of Breast Surgery, Cancer Hospital of Shantou University Medical College, Shantou, Guangdong, China.
Department of Breast Surgery, Jiangmen Central Hospital, Jiangmen, Guangdong, China.
Front Immunol. 2025 May 27;16:1581982. doi: 10.3389/fimmu.2025.1581982. eCollection 2025.
Breast cancer (BC) remains a leading cause of cancer-related mortality among women worldwide. Natural killer (NK) cells play a crucial role in the innate immune system and exhibit significant anti-tumor activity. However, the role of NK cell-related genes (NRGs) in BC diagnosis and prognosis remains underexplored. With the advent of machine learning (ML) techniques, predictive modeling based on NRGs may offer a new avenue for precision oncology.
We collected transcriptomic and clinical data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differentially expressed genes (DEGs) were identified, and key prognostic NRGs were selected using univariate and multivariate Cox regression analyses. We constructed ML-based diagnostic models using 12 algorithms and compared their performance to identify the optimal diagnostic model. Additionally, a prognostic risk model was developed using LASSO-Cox regression, and its performance was validated in independent cohorts. To explore the mechanisms that may underlie the prognostic differences between high-risk and low-risk patient groups, as well as their drug sensitivities, we conducted functional enrichment analysis, tumor microenvironment analysis, immunotherapy response prediction, drug sensitivity analysis, and mutation analysis.
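As a rough illustration of how a LASSO-Cox risk model is typically applied once fitted, the sketch below computes a risk score as a weighted sum of gene expression values and splits a cohort into high-risk and low-risk groups. The coefficients, expression values, patient identifiers, and the median cutoff are all hypothetical assumptions for illustration; the abstract does not report the fitted coefficients or the cutoff rule.

```python
from statistics import median

# Hypothetical coefficients, for illustration only; these are NOT the
# values fitted in the study, and only three signature genes are shown.
COEFFICIENTS = {
    "ULBP2": 0.30,
    "CCL5": -0.20,
    "CD2": -0.10,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over the genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    The median cutoff is an assumption; other cutoffs are also used in practice.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"ULBP2": 5.0, "CCL5": 1.0, "CD2": 1.0},
    "P2": {"ULBP2": 1.0, "CCL5": 5.0, "CD2": 4.0},
    "P3": {"ULBP2": 3.0, "CCL5": 3.0, "CD2": 2.0},
    "P4": {"ULBP2": 4.0, "CCL5": 2.0, "CD2": 1.0},
}
groups = stratify(patients)
```

In a real analysis the coefficients would come from the penalized Cox fit on the training cohort, and the same coefficients and cutoff would be carried over unchanged to the validation cohorts.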
ULBP2, CCL5, PRDX1, IL21, NFATC2, CD2, and VAV3 were identified as key NRGs for the construction of ML models. Among the 12 ML diagnostic models, the Random Forest (RF) model performed best, robustly distinguishing BC from normal tissues in both the training (TCGA) and validation (GEO) cohorts. For the prognostic model, the risk score based on LASSO-Cox regression effectively separated high-risk from low-risk patients: patients in the high-risk group had significantly poorer overall survival (OS) than those in the low-risk group, a finding that was validated in the GEO cohorts. Patients in the high-risk group displayed increased tumor proliferation, immune evasion, and reduced immune cell infiltration, correlating with poorer prognosis and lower response rates to immunotherapy. Furthermore, drug sensitivity analysis indicated that high-risk patients were more sensitive to Thapsigargin, Docetaxel, AKT inhibitor VIII, Pyrimethamine, and Epothilone B, and more resistant to drugs such as I-BET-762, PHA-665752, and Belinostat.
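The abstract does not state which metric was used to rank the diagnostic models. A common choice for a tumor-versus-normal classifier is the area under the ROC curve (AUC), assumed here purely for illustration. The sketch below computes AUC directly from its pairwise definition; the labels and scores are made-up values, not study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample receives a
    higher score than a random negative sample (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = tumor, 0 = normal; scores are predicted
# probabilities from some classifier (for instance, a random forest).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of samples, which is fine for a small sketch; a rank-based computation would be the usual choice for cohort-sized data.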
This study provides a comprehensive analysis of NRGs in BC and establishes reliable ML-based diagnostic and prognostic models. The findings highlight the clinical relevance of NRGs in BC progression, immune regulation, and therapy response, offering potential targets for personalized treatment strategies.