Zarean Shahraki Saba, Azizmohammad Looha Mehdi, Mohammadi Kazaj Pooya, Aria Mehrad, Akbari Atieh, Emami Hassan, Asadi Farkhondeh, Akbari Mohammad Esmaeil
Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Front Oncol. 2023 Jun 5;13:1147604. doi: 10.3389/fonc.2023.1147604. eCollection 2023.
Breast cancer (BC) survival prediction can help identify important prognostic factors, select effective treatment, and reduce mortality rates. This study aims to predict the time-dependent survival probability of BC patients across molecular subtypes over 30 years of follow-up.
This study retrospectively analyzed 3580 patients diagnosed with invasive BC from 1991 to 2021 at the Cancer Research Center of Shahid Beheshti University of Medical Sciences. The dataset contained 18 predictor variables and two dependent variables: the survival status of each patient and the time survived since diagnosis. Feature importance analysis was performed with the random forest algorithm to identify significant prognostic factors. Time-to-event deep learning models, namely Nnet-survival, DeepHit, DeepSurv, N-MTLR, and Cox-Time, were tuned by grid search, first with all variables and then with only the most important variables selected by the feature importance analysis. The best-performing model was chosen using the concordance index (C-index) and the integrated Brier score (IBS). Additionally, the dataset was grouped by molecular receptor status (i.e., luminal A, luminal B, HER2-enriched, and triple-negative), and the best-performing model was used to estimate survival probability for each molecular subtype.
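To make the C-index used for model selection concrete, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the study's code: the function name `concordance_index`, its arguments, and the toy data are assumptions made for illustration, and pairs with tied observed times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, not the study's implementation).

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here (a simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death correctly given higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
perfect = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
mixed = concordance_index(toy_times, toy_events, [0.9, 0.1, 0.5, 0.7])
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking. The IBS complements this by measuring calibration over time, with lower values being better.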
The random forest method identified tumor state, age at diagnosis, and lymph node status as the best subset of variables for predicting BC survival probabilities. All models performed very similarly, with Nnet-survival marginally best (C-index = 0.77, IBS = 0.13) whether trained on all 18 variables or on only the three most important ones. The luminal A subtype had the highest predicted survival probabilities, while the triple-negative and HER2-enriched subtypes had the lowest predicted survival probabilities over time. The luminal B subtype followed a trend similar to luminal A for the first five years, after which its predicted survival probability decreased steadily at the 10- and 15-year intervals.
This study provides valuable insight into the survival probability of patients based on their molecular receptor status, particularly for HER2-positive patients. This information can be used by healthcare providers to make informed decisions regarding the appropriateness of medical interventions for high-risk patients. Future clinical trials should further explore the response of different molecular subtypes to treatment in order to optimize the efficacy of breast cancer treatments.