Alberta Machine Intelligence Institute, Edmonton, Alberta, Canada.
Department of Surgery, University of Alberta, Edmonton, Alberta, Canada.
Clin Cancer Res. 2023 Oct 2;29(19):3924-3936. doi: 10.1158/1078-0432.CCR-22-3493.
Personalized medicine attempts to predict survival time for each patient, based on their individual tumor molecular profile. We investigate whether our survival learner in combination with a dimension reduction method can produce useful survival estimates for a variety of patients with cancer.
This article presents a method that learns a model for predicting survival time for individual patients with cancer from the PanCancer Atlas. Given the 16,335-dimensional gene expression profiles of 10,173 patients, each with one of 33 cancers, the method uses unsupervised nonnegative matrix factorization (NMF) to reexpress each patient's gene expression data in terms of 100 learned NMF factors. It then feeds these 100 factors into the Multi-Task Logistic Regression (MTLR) learner to produce cancer-specific models for each of 20 cancers (those with >50 uncensored instances). Each model produces "individual survival distributions" (ISD), which give survival probabilities at each future time for each individual patient and from which a patient's risk score and estimated survival time can be derived.
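To make the idea of an individual survival distribution concrete, the following is a minimal sketch of how MTLR-style per-interval scores can be turned into a survival curve and a point estimate of survival time. It follows the general MTLR formulation as commonly described, not the authors' implementation; the function names, the input scores, the time grid, and the use of the median as the survival time estimate are all assumptions made for illustration.

```python
import math


def mtlr_survival_curve(interval_scores):
    """Turn per-interval scores into survival probabilities.

    interval_scores[j] stands for one patient's linear score
    (theta_j . x + b_j) for time interval j; the values passed in
    are hypothetical, not taken from the paper.
    Returns S, where S[k] is the probability of surviving past interval k.
    """
    m = len(interval_scores)
    # Score of "event occurs in interval k" is the sum of scores from k on;
    # the extra final entry (score 0) means "survives past every interval".
    seq_scores = [sum(interval_scores[k:]) for k in range(m)] + [0.0]
    top = max(seq_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in seq_scores]
    total = sum(weights)
    event_probs = [w / total for w in weights]
    # S[k] is the probability that the event falls after interval k.
    return [sum(event_probs[k + 1:]) for k in range(m)]


def median_survival_time(times, surv):
    """Time at which the curve first reaches 0.5, by linear interpolation.

    times[k] is the end of interval k and surv[k] the survival probability
    there; the curve is assumed to start at probability 1.0 at time 0.
    Returns None if the curve never drops to 0.5 on the given grid.
    """
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(times, surv):
        if s <= 0.5:
            return prev_t + (prev_s - 0.5) * (t - prev_t) / (prev_s - s)
        prev_t, prev_s = t, s
    return None


# Hypothetical patient: three intervals ending at 1, 2, and 3 years.
curve = mtlr_survival_curve([0.0, 0.0, 0.0])
median = median_survival_time([1.0, 2.0, 3.0], curve)
```

With all scores equal to zero, every outcome is equally likely, so the curve falls in equal steps. In practice the scores would come from a fitted model applied to a patient's 100 NMF factors.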
Our NMF-MTLR concordance indices outperformed the VAECox benchmark by 14.9% overall. We achieved optimal survival prediction using pan-cancer NMF in combination with cancer-specific MTLR models. We provide biological interpretation of the NMF model and clinical implications of ISDs for prognosis and therapeutic response prediction.
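For readers unfamiliar with the concordance index, here is a minimal sketch of Harrell's C for right-censored data. It is one common definition and may differ in detail from the variant used in the study; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times:  observed time for each patient (event or censoring time)
    events: True/1 if the event was observed, False/0 if censored
    risks:  predicted risk score, where higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event has higher predicted risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored at time 3.
c_index = concordance_index([1, 2, 3], [1, 1, 0], [3.0, 2.0, 1.0])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is ranked in reverse.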
NMF-MTLR provides many benefits over other models: superior model discrimination, superior calibration, meaningful survival time estimates, and accurate probabilistic estimates of survival over time for each individual patient. We advocate for the adoption of these cancer survival models in clinical and research settings.