Aselisewine Wisdom, Pal Suvra, Saulo Helton
Department of Mathematics, University of Texas at Arlington, Arlington, TX, USA.
Division of Data Science, College of Science, University of Texas at Arlington, Arlington, TX, USA.
J Appl Stat. 2024 Oct 23;52(6):1177-1194. doi: 10.1080/02664763.2024.2418476. eCollection 2025.
The mixture cure rate model (MCM) is the most widely used model for the analysis of survival data with a cured subgroup. In this context, the most common strategy for modeling the cure probability is to assume a generalized linear model with a known link function, such as the logit link. However, the logit model can capture only simple effects of covariates on the cure probability. In this article, we propose a new MCM in which the cure probability is modeled using a decision tree-based classifier and the survival distribution of the uncured subjects is modeled using an accelerated failure time structure. To estimate the model parameters, we develop an expectation maximization (EM) algorithm. Our simulation study shows that the proposed model captures nonlinear classification boundaries better than the logit-based MCM and the spline-based MCM. This yields more accurate and precise estimates of the cure probabilities, which in turn improves the predictive accuracy of cure. We further show that capturing nonlinear classification boundaries also improves the estimation results for the survival distribution of the uncured subjects. Finally, we apply the proposed model and the EM algorithm to analyze an existing bone marrow transplant dataset.
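The abstract does not state the model equations, so the following is only a minimal sketch of the standard mixture cure formulation, not the authors' implementation. It assumes the population survival function is S_pop(t) = (1 - p) + p * S_u(t), where p is the probability of being uncured and S_u is the survival function of the uncured, and it shows the usual E-step weight that an EM algorithm for such a model computes. All function and argument names are hypothetical. In the proposed model, the logit function shown here would be replaced by the output of a decision tree-based classifier.

```python
import math


def logit_uncured_probability(linear_predictor):
    """Baseline logit link: maps a linear predictor to an uncured probability.

    Assumption: this is the conventional logit-based MCM component that the
    abstract contrasts with; a tree-based classifier would replace it.
    """
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def population_survival(p_uncured, surv_uncured):
    """Standard mixture cure survival: cured subjects never fail, so they
    contribute (1 - p); uncured subjects contribute p * S_u(t)."""
    return (1.0 - p_uncured) + p_uncured * surv_uncured


def e_step_weight(event, p_uncured, surv_uncured):
    """Conditional probability that a subject is uncured given the data.

    event = 1: failure observed, so the subject is certainly uncured.
    event = 0: censored, so the weight is p * S_u / S_pop.
    """
    if event == 1:
        return 1.0
    return p_uncured * surv_uncured / population_survival(p_uncured, surv_uncured)


if __name__ == "__main__":
    # Hypothetical censored subject: uncured probability 0.5, S_u(t) = 0.2.
    print(population_survival(0.5, 0.2))
    print(e_step_weight(0, 0.5, 0.2))
```

In a full EM iteration under these assumptions, the weights from `e_step_weight` would serve as soft labels when refitting the classifier for the cure probability and as case weights when refitting the accelerated failure time component for the uncured subjects.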