Roy Jason
Department of Biostatistics and Computational Biology, 601 Elmwood Ave., University of Rochester, Rochester, New York, USA.
Biometrics. 2003 Dec;59(4):829-36. doi: 10.1111/j.0006-341x.2003.00097.x.
In longitudinal studies with dropout, pattern-mixture models form an attractive modeling framework to account for nonignorable missing data. However, pattern-mixture models assume that the components of the mixture distribution are entirely determined by the dropout times. That is, two subjects with the same dropout time have the same distribution for their response with probability one. As that is unlikely to be the case, this assumption may lead to classification error. In addition, if there are certain dropout patterns with very few subjects, which often occurs when the number of observation times is relatively large, pattern-specific parameters may be weakly identified or require identifying restrictions. We propose an alternative approach, which is a latent-class model. The dropout time is assumed to be related to the unobserved (latent) class membership, where the number of classes is less than the number of observed patterns; a regression model for the response is specified conditional on the latent variable. This is a type of shared-parameter model, where the shared "parameter" is discrete. Parameter estimates are obtained using the method of maximum likelihood. Averaging the estimates of the conditional parameters over the distribution of the latent variable yields estimates of the marginal regression parameters. The methodology is illustrated using longitudinal data on depression from a study of HIV in women.
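The averaging step described in the abstract can be illustrated with a minimal sketch. It assumes a linear (identity-link) regression within each latent class, so that a weighted average of class-specific coefficients gives the marginal coefficients. The function name, the three classes, the class probabilities, and the coefficient values are all hypothetical and are not taken from the paper.

```python
def marginal_coefficients(class_probs, class_coefs):
    """Average class-specific coefficient vectors over the latent class distribution.

    class_probs: estimated probability of each latent class (must sum to 1).
    class_coefs: one coefficient vector per class, all of the same length.
    """
    if len(class_probs) != len(class_coefs):
        raise ValueError("need one coefficient vector per class")
    if abs(sum(class_probs) - 1.0) > 1e-9:
        raise ValueError("class probabilities must sum to 1")
    n_coefs = len(class_coefs[0])
    # Weighted sum over classes, computed separately for each coefficient.
    return [
        sum(p * coefs[j] for p, coefs in zip(class_probs, class_coefs))
        for j in range(n_coefs)
    ]


# Hypothetical estimates for three latent classes: [intercept, slope over time].
class_probs = [0.5, 0.3, 0.2]
class_coefs = [[10.0, -0.5], [14.0, 0.0], [20.0, 1.0]]

marginal = marginal_coefficients(class_probs, class_coefs)
print(marginal)
```

With a nonlinear link, the class-specific means rather than the coefficients would need to be averaged, so this sketch applies only under the stated linear assumption.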