Cohesive data analysis for the identification of prognostic hub genes and significant pathways associated with HER2+ and TN breast cancer types.

Authors

Zakir Mahrukh, Saddiqa Alishbah, Sheikh Mawara, Zakir Lalarukh, Sami Fatima, Ahmad Faisal Sardar, Rauf Sadaf Abdul, Ali Iqra, Muneer Zahid, Alonazi Wadi B, Siddiqi Abdul Rauf

Affiliations

Department of Biosciences, COMSATS University, Park Road Islamabad, Islamabad, Pakistan.

Pakistan Agriculture Research Council Islamabad, Islamabad, Pakistan.

Publication information

Sci Rep. 2025 Jul 2;15(1):23675. doi: 10.1038/s41598-025-94084-0.

DOI:10.1038/s41598-025-94084-0

PMID:40604083

Abstract

Breast cancer is the most prevalent and lethal form of cancer and the most common medical concern among women. Its etiology implicates numerous cellular protein receptors, such as estrogen receptors (ER), progesterone receptors (PR), and human epidermal growth factor receptor 2 (HER2), which turn on oncogenic cascades often attributed to certain genetic variations. Breast cancer (BC) is thus classified into ER+/-, PR+/-, HER2+/-, and triple-negative (TN) types. This study seeks to build upon current knowledge of the HER2+ and triple-negative breast cancer (TNBC) types to discover novel patterns for diagnosis and prognosis. It exploits the wealth of HER2+ and TNBC transcriptome (RNA-Seq) data to elucidate the key hub genes, their associated networks and pathways, their stage-wise expression profiles, their role in prognosis and survival expectancy, and their regulatory transcription factors. The study also employs machine learning (ML) models, including support vector machine (SVM), XGBoost, Random Forest, k-nearest neighbor (kNN), Naïve Bayes, and a Voting Classifier, to distinguish between HER2+ and TNBC transcriptomes, a distinction that is key to early detection and to the choice of therapeutic alternatives. RNA-Seq datasets consisting of 49 HER2+ and 44 TNBC breast tumor samples were retrieved and pre-processed. Differentially expressed genes (DEGs), along with their logFC and p-values, were obtained. KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) analyses of the DEGs were conducted in DAVID (the Database for Annotation, Visualization and Integrated Discovery), and an interaction network was constructed in Cytoscape. Ten hub genes were obtained with cytoHubba based on maximum clique centrality (MCC), maximum neighborhood component (MNC), degree, closeness, and betweenness: ACTB, ATM, ESR1, GAPDH, HNRNPK, KRAS, MDM2, SIRT1, TP53, and H3F3C (H3-5). These hub genes were found to be associated with cell proliferation, invasion, and migration. The transcription factors regulating these hub genes and the association of the hub genes' expression profiles with survival expectancy were also determined. Among the ML models, SVM stood out, classifying HER2+ and TNBC transcriptomes with an accuracy of 90%. The findings of this study can therefore aid in tracing the initial prognosis of BC and in identifying biomarkers for the personalized prevention, prediction, diagnosis, and treatment of BC.
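To make the hub-gene step more concrete, the following is a minimal sketch of how three of the topological measures named in the abstract (degree, closeness, and betweenness) can be computed on an interaction network. It is not the authors' pipeline and not cytoHubba's implementation: the toy network, the placeholder node names `G1` to `G5`, and the rank-sum aggregation in `rank_hubs` are assumptions introduced purely for illustration, and MCC and MNC are omitted.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def degree_centrality(graph):
    """Number of direct interaction partners per node."""
    return {n: len(nbrs) for n, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the summed distance to them."""
    scores = {}
    for n in graph:
        dist = bfs_distances(graph, n)
        total = sum(dist.values())
        scores[n] = (len(dist) - 1) / total if total else 0.0
    return scores


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {n: 0.0 for n in graph}
    for s in graph:
        stack = []
        preds = {n: [] for n in graph}
        sigma = {n: 0 for n in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {n: b / 2 for n, b in score.items()}


def rank_hubs(graph, k):
    """Hypothetical aggregation: sum each node's rank over the three measures."""
    measures = [
        degree_centrality(graph),
        closeness_centrality(graph),
        betweenness_centrality(graph),
    ]
    rank_sum = {n: 0 for n in graph}
    for measure in measures:
        ordered = sorted(graph, key=lambda n: measure[n], reverse=True)
        for position, n in enumerate(ordered):
            rank_sum[n] += position
    return sorted(graph, key=lambda n: rank_sum[n])[:k]


# Toy undirected network with placeholder node names (not real interactions).
toy_network = {
    "G1": ["G2", "G3", "G4"],
    "G2": ["G1"],
    "G3": ["G1"],
    "G4": ["G1", "G5"],
    "G5": ["G4"],
}

if __name__ == "__main__":
    print(rank_hubs(toy_network, k=2))
```

In this toy network, `G1` has the most partners and lies on the most shortest paths, so it ranks first under all three measures. On a real protein interaction network the measures often disagree, which is one reason to consult several of them rather than a single score.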


Similar articles

Integrated proteomics and transcriptomics analysis reveals key regulatory genes between ER-positive/PR-positive and ER-positive/PR-negative breast cancer.

BMC Cancer. 2025 Jul 1;25(1):1048. doi: 10.1186/s12885-025-14451-y.

Network-based meta-analysis and confirmation of genes ATP1A2, FXYD1, and ADCY3 associated with cAMP signaling in breast tumors compared to corresponding normal marginal tissues.

Cell Mol Biol (Noisy-le-grand). 2024 Nov 27;70(11):16-30. doi: 10.14715/cmb/2024.70.11.3.

On discovery of novel hub genes for ER+ and TN breast cancer types through RNA seq data analyses and classification models.

Sci Rep. 2024 Sep 6;14(1):20840. doi: 10.1038/s41598-024-69721-9.

Cost-effectiveness of using prognostic information to select women with breast cancer for adjuvant systemic therapy.

Health Technol Assess. 2006 Sep;10(34):iii-iv, ix-xi, 1-204. doi: 10.3310/hta10340.

Tumor Suppressor miRNA-based Signatures in Triple Negative Breast Cancer: A Study Based on Big Data Analysis of Gene Expression Omnibus (GEO) Datasets and Its Validation.

Asian Pac J Cancer Prev. 2025 Jun 1;26(6):2087-2095. doi: 10.31557/APJCP.2025.26.6.2087.

Global Transcriptional Complexity of Estrogen Receptor-Low Positive Breast Cancers in the Prospective Swedish Population-Based SCAN-B Cohort.

Clin Cancer Res. 2025 Jul 1;31(13):2695-2709. doi: 10.1158/1078-0432.CCR-24-3435.

The Association between ER, PR, HER2, and ER-/PR+ Expression and Lung Cancer Subsequent in Breast Cancer Patients: A Retrospective Cohort Study Based on SEER Database.

Breast J. 2023 Nov 11;2023:7028189. doi: 10.1155/2023/7028189. eCollection 2023.

Exploring the molecular mechanisms of comorbidity between thyroid cancer and breast cancer through multi-omics data.

Sci Rep. 2025 Jul 2;15(1):23309. doi: 10.1038/s41598-025-06566-w.

Deciphering Shared Gene Signatures and Immune Infiltration Characteristics Between Gestational Diabetes Mellitus and Preeclampsia by Integrated Bioinformatics Analysis and Machine Learning.

Reprod Sci. 2025 May 15. doi: 10.1007/s43032-025-01847-1.
