Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, Ohio, USA.
Clara D. Bloomfield Center for Leukemia Outcomes Research, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA.
Stat Med. 2022 Sep 30;41(22):4340-4366. doi: 10.1002/sim.9513. Epub 2022 Jul 6.
Medical breakthroughs in recent years have led to cures for many diseases. The mixture cure model (MCM) is a type of survival model that is often used when a cured fraction exists. Many studies have sought to identify genomic features associated with a time-to-event outcome, which requires variable selection strategies for high-dimensional spaces. However, few variable selection methods currently exist for MCMs, especially when there are more predictors than samples. This study develops high-dimensional penalized Weibull MCMs, which allow identification of prognostic factors associated with cure status, survival, or both. We demonstrate how such models may be estimated using two different iterative algorithms. The model-X knockoffs method was combined with these algorithms to control the false discovery rate (FDR) in variable selection. In extensive simulation studies, our penalized MCMs outperformed alternative methods on multiple metrics and achieved high statistical power while controlling the FDR. In an acute myeloid leukemia (AML) application with gene expression data, our proposed approach identified 14 genes associated with potential cure and 12 genes associated with time-to-relapse, which may help inform treatment decisions for AML patients.
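To make the structure of a Weibull mixture cure model concrete, the sketch below evaluates the population survival function, which mixes a cured fraction (who never experience the event) with a Weibull survival curve for the uncured. The abstract does not state the authors' parametrization, so everything specific here is an assumption for illustration: the logistic link for the probability of being uncured, the proportional-hazards form of the Weibull component, and all names (`weibull_mcm_survival`, `x_cure`, `x_surv`, `b0`, `b`, `beta`, `alpha`, `lam`). The penalization, the estimation algorithms, and the knockoff step are not shown.

```python
import math


def weibull_mcm_survival(t, x_cure, x_surv, b0, b, beta, alpha, lam):
    """Population survival S(t) = (1 - p) + p * S_u(t) for one subject.

    Assumed (hypothetical) parametrization, not taken from the paper:
      p      = logistic(b0 + x_cure . b)   probability of being uncured
      S_u(t) = exp(-lam * t**alpha * exp(x_surv . beta))   Weibull, PH form
    """
    # Incidence part: probability that the subject is susceptible (uncured).
    eta = b0 + sum(bj * xj for bj, xj in zip(b, x_cure))
    p_uncured = 1.0 / (1.0 + math.exp(-eta))

    # Latency part: Weibull survival among the uncured.
    lin = sum(bj * xj for bj, xj in zip(beta, x_surv))
    s_uncured = math.exp(-lam * (t ** alpha) * math.exp(lin))

    # Cured subjects contribute survival 1 at every time point.
    return (1.0 - p_uncured) + p_uncured * s_uncured


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from any data set.
    for t in (0.0, 1.0, 5.0, 50.0):
        s = weibull_mcm_survival(t, [1.0], [0.5], b0=0.2, b=[-0.4],
                                 beta=[0.3], alpha=1.5, lam=0.1)
        print(f"t={t:5.1f}  S(t)={s:.4f}")
```

Under these assumptions the curve starts at 1 and levels off at the cured fraction, 1 - p, rather than dropping to 0. That plateau is what distinguishes a cure model from a standard survival model, and it is why separate sets of predictors can be associated with cure status and with time-to-event.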