Zhou Qingning, Cao Xu
Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, NC 28223, United States.
Department of Statistics, University of California at Riverside, Riverside, CA 92521, United States.
Biometrics. 2025 Apr 2;81(2). doi: 10.1093/biomtc/ujaf059.
In studies of time-to-event outcomes, a fraction of subjects often never experience the event of interest; these subjects are said to be cured. Studies with a cure fraction often yield a low event rate. To reduce cost and enhance study power, 2-phase sampling designs are often adopted, especially when the covariates of interest are expensive to measure or obtain. In this paper, we consider the generalized case-cohort design for studies with a cure fraction. Under this design, the expensive covariates are measured for a subset of the study cohort, called the subcohort, and for all or a subset of the remaining subjects outside the subcohort who have experienced the event during the study, called cases. We propose a 2-step estimation procedure under a class of semiparametric transformation mixture cure models. We first develop a sieve maximum weighted likelihood method based only on the complete data and devise an Expectation-Maximization (EM) algorithm for its implementation. We then update the resulting estimator via a working model between the outcome and cheap covariates or auxiliary variables using the full data. We show that the proposed updated estimator is consistent and asymptotically at least as efficient as the complete-data estimator, regardless of whether the working model is correctly specified. We also propose a weighted bootstrap procedure for variance estimation. Extensive simulation studies demonstrate the superior finite-sample performance of the proposed method. An application to the National Wilms' Tumor Study is provided for illustration.
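As a rough illustration of the generalized case-cohort design described above, the following Python sketch selects a subcohort and a sample of the remaining cases, then computes inverse-probability weights of the kind a weighted likelihood could use. It assumes independent Bernoulli sampling; the function name `case_cohort_weights` and the parameters `p_sub` and `p_case` are hypothetical and not taken from the paper, whose actual sampling scheme and weights may differ.

```python
import numpy as np


def case_cohort_weights(event, p_sub, p_case, rng):
    """Draw a generalized case-cohort sample; return (selected, weight).

    event  : boolean array, True if the subject experienced the event
    p_sub  : assumed Bernoulli probability of entering the subcohort
    p_case : assumed Bernoulli probability of sampling a case that
             falls outside the subcohort
    """
    event = np.asarray(event, dtype=bool)
    n = event.size
    # Subcohort: sampled from the whole cohort, regardless of event status.
    in_sub = rng.random(n) < p_sub
    # Additional cases: sampled only among events outside the subcohort.
    extra_case = (~in_sub) & event & (rng.random(n) < p_case)
    selected = in_sub | extra_case
    # Inclusion probability: non-cases enter only through the subcohort,
    # cases through the subcohort or the additional case sample.
    incl_prob = np.where(event, p_sub + (1.0 - p_sub) * p_case, p_sub)
    # Unselected subjects get weight 0 (expensive covariates unobserved).
    weight = np.where(selected, 1.0 / incl_prob, 0.0)
    return selected, weight


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    event = rng.random(5000) < 0.1  # low event rate, as with a cure fraction
    selected, weight = case_cohort_weights(event, 0.2, 1.0, rng)
    print("phase-2 sample size:", int(selected.sum()))
    print("weighted cohort size:", float(weight.sum()))
```

With `p_case = 1.0` every case is selected, which corresponds to the classical case-cohort design; values below 1 give the generalized version in which only some of the remaining cases have the expensive covariates measured.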