Lin Feng-Chang, Cai Jianwen, Fine Jason P, Lai Huichuan J
Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
Department of Nutritional Sciences, University of Wisconsin, Madison, Wisconsin 53706, U.S.A.
Biometrika. 2013;100(3). doi: 10.1093/biomet/ast016.
Recurrent event data frequently arise in longitudinal studies when subjects may experience more than one event during the observation period. Often, such recurrent events can be categorized by type. However, the event type may be missing for some events due to technical difficulties. If the event types are missing completely at random, then a complete case analysis may provide consistent estimates of regression parameters in certain regression models, but estimates of the baseline event rates are generally biased. Previous work on nonparametric estimation of these rates has relied on parametric models for the missingness. In this paper, we develop fully nonparametric methods in which the missingness mechanism is completely unspecified. We establish consistency and asymptotic normality of the nonparametric estimators of the mean event functions, even though these estimators depend on nonparametric estimators of the event category probabilities, which converge more slowly than the parametric rate. Plug-in variance estimators are provided and perform well in simulation studies, in which complete case estimators may exhibit large biases and parametric estimators generally have larger mean squared error when the missingness model is misspecified. The proposed methods are applied to data from a cystic fibrosis registry.
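The claim that complete case analysis biases baseline event rates, even when types are missing completely at random, can be illustrated with a small simulation. The sketch below is not the estimator proposed in the paper. It assumes two event types generated by homogeneous Poisson processes, a constant missingness probability, and a simple reweighting by the observed type-1 fraction. The function name `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=5000, tau=2.0, lam1=1.5, lam2=0.5, p_missing=0.3, seed=0):
    """Compare two estimates of the mean number of type-1 events by time tau.

    Assumptions (illustrative only, not taken from the paper):
    - type-1 and type-2 events follow independent homogeneous Poisson
      processes with rates lam1 and lam2 on [0, tau];
    - each event's type is missing independently with probability
      p_missing (missing completely at random).
    """
    rng = np.random.default_rng(seed)

    # True per-subject event counts of each type on [0, tau].
    n1 = rng.poisson(lam1 * tau, size=n)
    n2 = rng.poisson(lam2 * tau, size=n)

    # Events whose type was actually recorded.
    obs1 = rng.binomial(n1, 1.0 - p_missing)
    obs2 = rng.binomial(n2, 1.0 - p_missing)

    # Complete case estimate: count only events with a recorded type 1.
    # Its expectation is lam1 * tau * (1 - p_missing), so it is biased downward.
    complete_case = obs1.mean()

    # Simple adjustment: all events are counted (typed or not), then scaled
    # by the fraction of type 1 among the events with a recorded type.
    total_events = (n1 + n2).mean()
    frac_type1 = obs1.sum() / (obs1.sum() + obs2.sum())
    adjusted = total_events * frac_type1

    return {
        "truth": lam1 * tau,
        "complete_case": complete_case,
        "adjusted": adjusted,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the complete case estimate targets the true mean shrunk by the factor `1 - p_missing`, while the reweighted estimate targets the true mean. The reweighting here uses a single constant type probability. The methods described in the abstract instead estimate the event category probabilities nonparametrically and leave the missingness mechanism unspecified.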