Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran.
Physiology Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran.
J Cancer Res Ther. 2023 Apr;19(Supplement):S126-S137. doi: 10.4103/jcrt.jcrt_811_21.
Breast cancer (BC) is the most common cancer and the fifth leading cause of death in women worldwide. Identifying genes that are unique to specific cancers has attracted considerable research interest.
This study aimed to explore unique genes of five molecular subtypes of BC in women using penalized logistic regression models. For this purpose, microarray data from five independent Gene Expression Omnibus (GEO) data sets were combined. The combined data set comprised genetic information from 324 women with BC and 12 healthy women. Least absolute shrinkage and selection operator (LASSO) logistic regression and adaptive LASSO logistic regression were used to extract unique genes. The biological processes of the extracted genes were evaluated with the open-source GOnet web application. Models were fitted in R version 3.6.0 with the glmnet package.
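The two penalized models named above can be illustrated with a minimal sketch. It is not the study's code: the study used glmnet in R, whereas this sketch is in Python and swaps in a simple proximal-gradient solver instead of glmnet's coordinate descent. The data are synthetic, and the function names, the penalty value `lam=0.05`, and the choice of an unpenalized initial fit to derive the adaptive weights are all assumptions made for illustration; the abstract does not state how the penalty or the weights were chosen.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrinks v toward zero, exactly zero if |v| <= t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam, weights=None, step=0.25, n_iter=500):
    """L1-penalized logistic regression by proximal gradient descent.

    weights holds per-feature penalty multipliers: all ones gives the
    ordinary LASSO, data-driven values give the adaptive LASSO.
    The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    w = weights if weights is not None else [1.0] * p
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals (predicted probability minus label) for every sample.
        resid = [sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        # Gradient of the mean negative log-likelihood.
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * sum(resid) / n
        # Gradient step followed by feature-wise soft-thresholding.
        beta = [soft_threshold(b - step * g, step * lam * wj)
                for b, g, wj in zip(beta, grad, w)]
    return b0, beta

# Synthetic stand-in for one pairwise comparison (hypothetical data):
# 100 samples, 5 "genes", of which only genes 0 and 1 drive the class label.
random.seed(0)
n_samples, n_genes = 100, 5
X = [[random.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
y = [1 if 2 * row[0] - 2 * row[1] + random.gauss(0, 1) > 0 else 0 for row in X]

# Ordinary LASSO: the same penalty for every gene.
_, beta_lasso = fit_l1_logistic(X, y, lam=0.05)
selected_lasso = [j for j, b in enumerate(beta_lasso) if b != 0.0]

# Adaptive LASSO: penalty weights are inverse absolute initial estimates,
# here taken from an unpenalized fit (an assumption of this sketch).
_, beta_init = fit_l1_logistic(X, y, lam=0.0)
adaptive_weights = [1.0 / (abs(b) + 1e-6) for b in beta_init]
_, beta_adaptive = fit_l1_logistic(X, y, lam=0.05, weights=adaptive_weights)
selected_adaptive = [j for j, b in enumerate(beta_adaptive) if b != 0.0]

print("LASSO selected genes:", selected_lasso)
print("Adaptive LASSO selected genes:", selected_adaptive)
```

Genes whose coefficients remain nonzero are the "extracted" genes for that comparison. In glmnet, per-feature weights of this kind are supplied through the `penalty.factor` argument, and the penalty strength is usually tuned by cross-validation rather than fixed in advance as it is here.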
In total, 119 genes were extracted across 15 pairwise comparisons. Seventeen genes (14%) overlapped between comparison groups. According to GO enrichment analysis, the extracted genes were enriched in biological processes involving negative and positive regulation, and molecular function analysis showed that most genes are involved in kinase and transferase activities. We also identified unique genes for each comparison group and their associated pathways. However, no significant pathway was identified for genes in the normal-like versus ERBB2 and luminal A, basal versus control, and luminal B versus luminal A comparisons.
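The counts reported above can be checked with a short sketch. It assumes that the 15 pairwise comparisons arise from six groups, namely the five molecular subtypes plus the healthy controls; the abstract names these groups but does not list the comparisons explicitly, so the group list below is an inference.

```python
from itertools import combinations

# Six groups inferred from the abstract: five BC subtypes plus controls.
groups = ["luminal A", "luminal B", "ERBB2", "basal", "normal-like", "control"]

# Every unordered pair of groups is one pairwise comparison.
pairs = list(combinations(groups, 2))
n_pairs = len(pairs)

# Share of the 119 extracted genes that overlapped between comparisons.
overlap_pct = round(100 * 17 / 119)

print(n_pairs, "pairwise comparisons")
print(overlap_pct, "percent of genes overlap")
```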
The genes selected by LASSO logistic regression and adaptive LASSO logistic regression were mostly unique to individual comparison groups of BC and point to related pathways. These findings may help clarify the molecular differences between subgroups and could inform further research and therapeutic approaches.