1 Theoretical and Applied Statistics Laboratory, Pierre and Marie Curie University, Paris, France.
2 Laboratoire de Probabilités Statistique et Modélisation (LPSM), UMR 8001, Sorbonne University, Paris, France.
Stat Methods Med Res. 2019 May;28(5):1523-1539. doi: 10.1177/0962280218766389. Epub 2018 Apr 15.
We introduce a supervised learning mixture model for censored durations (C-mix) to simultaneously detect subgroups of patients with different prognoses and order them based on their risk. Our method is applicable in a high-dimensional setting, i.e. with a large number of biomedical covariates. Indeed, we penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the relevant covariates for the survival prediction. Inference is achieved using an efficient Quasi-Newton Expectation Maximization algorithm, for which we provide convergence properties. The statistical performance of the method is examined in an extensive Monte Carlo simulation study and finally illustrated on three publicly available genetic cancer datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art survival models in this context, namely both the CURE and Cox proportional hazards models penalized by the Elastic-Net, in terms of C-index, AUC(t) and survival prediction. Thus, we propose a powerful tool for personalized medicine in cancerology.
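For readers unfamiliar with the C-index used in the comparison above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the authors' evaluation code: the function name `concordance_index`, its arguments, and the toy data are assumptions made here, and the convention that pairs with tied times are skipped is one common choice among several.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed durations (event or censoring time)
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only when the subject with the shorter
        # time actually experienced the event (was not censored).
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk failed first: concordant
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))
```

A value of 1 means the predicted risk ordering agrees with every comparable pair of observed outcomes, 0.5 corresponds to uninformative predictions, and 0 to a fully reversed ordering.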