Adam Nabil, Wieder Robert
Phalcon, LLC, Manhasset, NY 11030, USA.
Newark Campus, Rutgers University, Newark, NJ 07102, USA.
Cancers (Basel). 2024 Oct 18;16(20):3527. doi: 10.3390/cancers16203527.
Deep learning (DL)-based models for predicting the survival of patients with localized breast cancer use only time-fixed covariates, i.e., patient and cancer data at the time of diagnosis. These predictions are inherently error-prone because they do not consider time-varying events that occur after the initial diagnosis. Our objective is to improve the predictive modeling of survival of patients with localized breast cancer by considering both time-fixed and time-varying events, thereby taking into account the progression of a patient's health status over time.
We extended four DL-based predictive survival models (DeepSurv, DeepHit, Nnet-survival, and Cox-Time) that deal with right-censored time-to-event data to consider not only a patient's time-fixed covariates (patient and cancer data at diagnosis) but also a patient's time-varying covariates (e.g., treatments, comorbidities, progressive age, frailty index, adverse events from treatment). As study data, we used the SEER-Medicare linked dataset from 1991 to 2016 to study a population of women diagnosed with stage I-III breast cancer (BC) who enrolled in Medicare at 65 years or older as qualified by age. We delineated time-fixed variables recorded at the time of diagnosis, including age, race, marital status, breast cancer stage, tumor grade, laterality, estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status, and comorbidity index. We analyzed six distinct prognostic categories: stage I, II, and III BC, each divided by ER/PR+ or ER/PR- status. At each visit, we delineated the time-varying covariates of administered treatments, induced adverse events, comorbidity index, and age. We predicted the survival of three hypothetical patients to demonstrate the model's utility.
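One common way to present time-fixed and time-varying covariates together to a survival model is the counting-process (start, stop] layout, in which each visit opens a new row that repeats the diagnosis-time values and carries the visit's updated values. The sketch below illustrates that layout only; it is not taken from the study, and the function name, field names, and month-based time unit are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

def to_intervals(
    fixed: Dict[str, Any],
    visits: List[Tuple[float, Dict[str, Any]]],
    end_time: float,
    event: bool,
) -> List[Dict[str, Any]]:
    """Expand one patient's record into (start, stop] rows.

    fixed    : covariates recorded once at diagnosis (hypothetical keys)
    visits   : (time, time-varying covariates) pairs, first visit at time 0
    end_time : time of death or of right-censoring
    event    : True if the patient died at end_time, False if censored
    """
    ordered = sorted(visits, key=lambda v: v[0])
    rows: List[Dict[str, Any]] = []
    for k, (start, varying) in enumerate(ordered):
        if start >= end_time:
            break  # visits at or after the end of follow-up add no interval
        next_start = ordered[k + 1][0] if k + 1 < len(ordered) else end_time
        stop = min(next_start, end_time)
        if stop <= start:
            continue  # skip zero-length intervals from duplicate visit times
        rows.append({
            **fixed,
            **varying,
            "start": start,
            "stop": stop,
            # The event flag is set only on the interval that ends follow-up.
            "event": int(event and stop == end_time),
        })
    return rows

# Illustrative values only, not data from the study.
rows = to_intervals(
    fixed={"stage": 2, "er_pr_positive": True},
    visits=[
        (0, {"age": 70.0, "chemo": 0}),
        (6, {"age": 70.5, "chemo": 1}),
        (18, {"age": 71.5, "chemo": 1}),
    ],
    end_time=30,
    event=True,
)
```

Each row can then be scored by a model that accepts interval data; how the four extended models actually consume the time-varying inputs is not specified here.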
The primary outcomes were each model's prediction error, as measured by the concordance index (the most commonly applied evaluation metric in survival analysis), and the integrated Brier score, a measure of a model's discrimination and calibration.
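For readers unfamiliar with the concordance index, the following is a minimal sketch of Harrell's pairwise definition for right-censored data. It assumes that "prediction error" can be read as one minus the concordance index, which is an interpretation rather than something stated above; the function name and the toy values are illustrative.

```python
from typing import Sequence

def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's observed time. The pair is concordant
    when the model gave patient i the higher risk score; tied scores
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy example: four patients, the third is right-censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.5, 0.1],
)
error = 1.0 - c_index  # assumed reading of "prediction error"
```

In this toy example four of the five comparable pairs are ordered correctly, so the concordance index is 0.8.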
Extending the patients' covariates to include both time-fixed and time-varying covariates significantly reduced the deep learning models' prediction error and improved the discrimination and calibration of the models' estimates. With time-fixed covariates only, the four DL models had a prediction error of approximately 30% in each of the six prognostic categories. When the proposed extension was applied to include time-varying covariates, the accuracy of all four predictive models improved significantly, with the error decreasing to approximately 10%. The models' predictive accuracy was independent of the differing published survival predictions from time-fixed covariates in the six prognostic categories. We demonstrate the utility of the model in three hypothetical patients with unique patient, cancer, and treatment variables. The model predicted survival based on each patient's individual time-fixed and time-varying features, and its predictions differed considerably from Social Security age-based survival predictions and from stage- and race-based breast cancer survival predictions.
Predictive modeling of the survival of patients with early-stage breast cancer using DL models has a prediction error of around 30% when only time-fixed covariates at the time of diagnosis are considered. The error decreases to under 10% when time-varying covariates are added as input to the models, regardless of the prognostic category of the patient groups. These models can be used to predict individual patients' survival probabilities based on their unique repertoire of time-fixed and time-varying features. They will provide guidance for patients and their caregivers to assist in decision making.