Tapak Leili, Hamidi Omid, Amini Payam, Afshar Saeid, Salimy Siamak, Dinu Irina
Department of Biostatistics, School of Public Health and Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran.
Department of Science, Hamedan University of Technology, Hamedan, Iran.
Cancer Inform. 2023 Mar 21;22:11769351231157942. doi: 10.1177/11769351231157942. eCollection 2023.
Breast cancer (BC) is one of the most commonly diagnosed cancers in women worldwide. The survival of BC patients is strongly affected by metastasis, so exploring its underlying mechanisms and identifying biomarkers for monitoring BC relapse/recurrence with new statistical methods is essential. This study investigated the high-dimensional gene-expression profiles of BC patients using penalized additive hazards regression models.
A publicly available dataset on time to metastasis in BC patients (GSE2034) was used. It contains expression profiles of 22 283 genes for 286 BC patients. Penalized additive hazards regression models with five different penalties (LASSO, SCAD, SICA, MCP, and elastic net) were used to identify metastasis-related genes.
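For orientation, the model class named above can be written out as follows. This is a sketch of the standard Lin-Ying additive hazards formulation with a generic penalty, not a statement of the authors' exact estimating procedure; the symbols $Z_i$, $Y_i(t)$, $N_i(t)$, $V$, $b$, and $p_\lambda$ are notation assumed here and do not appear in the abstract.

```latex
% Additive hazards model: covariates shift the baseline hazard additively
\lambda(t \mid Z_i) = \lambda_0(t) + \beta^{\top} Z_i

% Least-squares-type loss built from the Lin-Ying pseudo-score
L(\beta) = \tfrac{1}{2}\,\beta^{\top} V \beta - b^{\top}\beta,
\qquad
V = \frac{1}{n}\sum_{i=1}^{n}\int_0^{\tau} Y_i(t)\,\{Z_i - \bar Z(t)\}^{\otimes 2}\,dt,
\qquad
b = \frac{1}{n}\sum_{i=1}^{n}\int_0^{\tau} \{Z_i - \bar Z(t)\}\,dN_i(t)

% Penalized estimator; for LASSO, p_\lambda(\theta) = \lambda\theta
\hat\beta = \arg\min_{\beta}\; L(\beta) + \sum_{j=1}^{p} p_\lambda(|\beta_j|)
```

Here $\bar Z(t)$ is the average covariate vector over patients still at risk at time $t$. The five penalties listed differ only in the choice of $p_\lambda$; genes whose coefficients are shrunk exactly to zero are dropped from the model.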
The five penalized additive hazards models jointly identified 9 genes including , , , , -1, , , and . Patients were labeled as high or low risk according to the median of the prognostic index, calculated from the regression coefficients of the penalized additive hazards model. The survival curves of the two groups differed significantly. The selected genes were also examined in validation data and were significantly associated with the hazard of metastasis.
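The median split on the prognostic index described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy expression matrix, and the coefficient values are hypothetical, and the rule that patients strictly above the median are labeled "high" is an assumption the abstract does not spell out.

```python
import numpy as np

def prognostic_index(expr, coefs):
    """Linear predictor per patient: rows of expr are patients, columns are selected genes."""
    return np.asarray(expr, dtype=float) @ np.asarray(coefs, dtype=float)

def split_by_median(pi):
    """Label a patient 'high' if the index exceeds the median, else 'low' (assumed rule)."""
    pi = np.asarray(pi, dtype=float)
    cutoff = np.median(pi)
    return np.where(pi > cutoff, "high", "low"), cutoff

# Hypothetical toy data: 4 patients, 2 selected genes, made-up coefficients.
expr = np.array([[1.0, 0.5],
                 [0.2, 1.5],
                 [2.0, 0.1],
                 [0.3, 0.3]])
coefs = np.array([0.4, -0.2])

pi = prognostic_index(expr, coefs)
groups, cutoff = split_by_median(pi)
print(groups, cutoff)
```

In practice the two groups would then be compared with Kaplan-Meier curves and a log-rank test on time to metastasis.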
This study showed that , -1, , , and are potential predictors of recurrence and metastasis in breast cancer and can be considered candidates for further research on tumorigenesis, invasion, metastasis, and epithelial-mesenchymal transition in breast cancer.