Wu Fan, Kim Sehee, Qin Jing, Saran Rajiv, Li Yi
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.
Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, U.S.A.
Biometrics. 2018 Mar;74(1):100-108. doi: 10.1111/biom.12746. Epub 2017 Aug 29.
Survival data collected from a prevalent cohort are subject to left truncation, which makes the analysis challenging. Conditional approaches for left-truncated data can be inefficient because they ignore the information in the marginal likelihood of the truncation times. Length-biased sampling methods may improve estimation efficiency, but only when the underlying truncation time is uniformly distributed; otherwise, they may produce biased estimates. We propose a semiparametric method for left-truncated data under the Cox model that makes no parametric distributional assumption about the truncation times. Our approach bases inference on the conditional likelihood augmented with a pairwise likelihood, which eliminates the truncation distribution yet retains the information about the regression coefficients and the baseline hazard function contained in the marginal likelihood. An iterative algorithm is provided to solve for the regression coefficients and the baseline hazard function simultaneously. Using empirical process and U-process theories, the proposed estimator is shown to be consistent and asymptotically normal, with a closed-form consistent variance estimator. Simulation studies show a substantial efficiency gain of our estimator over the conditional approach estimator for both the regression coefficients and the cumulative baseline hazard function. When the uniform truncation assumption holds, our estimator enjoys smaller biases and efficiency comparable to that of the full maximum likelihood estimator. An application to the analysis of a chronic kidney disease cohort study illustrates the utility of the method.
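The likelihood decomposition behind this idea can be sketched as follows. This is a simplified illustration, not the paper's exact formulation: it ignores right censoring, and the notation is assumed here rather than taken from the abstract. T is the failure time with density f(t | z) and survival function S(t | z), A is the truncation time with unspecified density g, A is taken to be independent of T and of the covariate Z, and a subject is observed only if T ≥ A. The first factor is the conditional likelihood; the second is the marginal likelihood of the truncation time, which the conditional approach discards. Comparing the observed pairing of truncation times with the swapped pairing for two subjects i and j gives a pairwise term in which g and the normalizing constants cancel.

```latex
% Assumed notation: \mu(z) = \int_0^\infty S(u \mid z)\, g(u)\, du = P(T \ge A \mid Z = z).
% Density of the observed pair (a, t), a \le t, for a subject with covariate z:
\[
  p(a, t \mid z)
  = \frac{f(t \mid z)\, g(a)}{\mu(z)}
  = \underbrace{\frac{f(t \mid z)}{S(a \mid z)}}_{\text{conditional likelihood}}
    \times
    \underbrace{\frac{S(a \mid z)\, g(a)}{\mu(z)}}_{\text{marginal likelihood of } a}.
\]

% Pairwise term for subjects i and j: probability of the observed assignment
% of the two truncation times, given the unordered pair \{a_i, a_j\}.
\[
  \frac{S(a_i \mid z_i)\, S(a_j \mid z_j)}
       {S(a_i \mid z_i)\, S(a_j \mid z_j) + S(a_j \mid z_i)\, S(a_i \mid z_j)}
  = \frac{1}{1 + R_{ij}},
  \qquad
  R_{ij} = \frac{S(a_j \mid z_i)\, S(a_i \mid z_j)}{S(a_i \mid z_i)\, S(a_j \mid z_j)}.
\]

% Under the Cox model, S(t \mid z) = \exp\{-\Lambda_0(t)\, e^{\beta^\top z}\}, so
\[
  R_{ij}
  = \exp\!\Big\{ -\big(e^{\beta^\top z_i} - e^{\beta^\top z_j}\big)
                  \big(\Lambda_0(a_j) - \Lambda_0(a_i)\big) \Big\}.
\]
```

In this sketch the pairwise term depends only on the regression coefficients β and the cumulative baseline hazard Λ₀, which is consistent with the abstract's statement that the truncation distribution is eliminated while information about both quantities is retained.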