Biostatistics and Bioinformatics Branch, Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD 20852, U.S.A.
Stat Med. 2011 Jul 10;30(15):1825-36. doi: 10.1002/sim.4239. Epub 2011 Apr 15.
In many biomedical and epidemiological studies, data are clustered because of longitudinal follow-up or repeated sampling. In some clustered data the cluster size is predetermined, but in others it may be correlated with the outcomes of the subunits, which makes the cluster size informative. When the cluster size is informative, standard statistical procedures that ignore it may produce biased estimates. One attractive framework for modeling data with informative cluster size is the joint modeling approach, in which a common set of random effects is shared by the outcome model and the cluster size model. In addition to making distributional assumptions about the shared random effects, the joint modeling approach must specify the cluster size model, which raises the question of whether the approach is robust to misspecification of that model. In this paper, we studied both the asymptotic and finite-sample characteristics of the maximum likelihood estimators in joint models when the cluster size model is misspecified. We found that using an incorrect distribution for the cluster size may induce small to moderate biases, whereas using a misspecified functional form for the shared random parameter in the cluster size model results in nearly unbiased estimation of the outcome model parameters. We also found little efficiency loss under this model misspecification. A developmental toxicity study was used to motivate the research and to demonstrate the findings.
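The claim that ignoring an informative cluster size can bias estimates can be illustrated with a small simulation. The sketch below is not the model fitted in the paper. It assumes, purely for illustration, a normal random intercept shared by a logistic outcome model and a shifted Poisson cluster size model; all function names and parameter values (`beta0`, `gamma0`, `gamma1`, `sigma`) are hypothetical. With `gamma1 < 0`, clusters with higher risk tend to be smaller, so pooling all subunits gives low-risk clusters more weight.

```python
import numpy as np


def simulate_clusters(n_clusters=20000, beta0=-1.0, gamma0=1.5,
                      gamma1=-0.5, sigma=1.0, seed=0):
    """Simulate clustered binary data with informative cluster size.

    Illustrative assumptions (not taken from the paper):
      b_i ~ Normal(0, sigma^2)                     shared random effect
      N_i = 1 + Poisson(exp(gamma0 + gamma1*b_i))  cluster size
      Y_ij | b_i ~ Bernoulli(expit(beta0 + b_i))   subunit outcome
    """
    rng = np.random.default_rng(seed)
    b = rng.normal(0.0, sigma, size=n_clusters)
    # Shift by one so that every cluster has at least one subunit.
    sizes = 1 + rng.poisson(np.exp(gamma0 + gamma1 * b))
    # Cluster-specific event probability from the logistic outcome model.
    p = 1.0 / (1.0 + np.exp(-(beta0 + b)))
    # Number of events in each cluster.
    events = rng.binomial(sizes, p)
    return sizes, events, p


def pooled_estimate(sizes, events):
    """Proportion over all subunits; implicitly weights clusters by size."""
    return events.sum() / sizes.sum()


def cluster_averaged_estimate(sizes, events):
    """Mean of per-cluster proportions; every cluster gets equal weight."""
    return np.mean(events / sizes)


sizes, events, p = simulate_clusters()
target = p.mean()  # average event probability across clusters
pooled = pooled_estimate(sizes, events)
averaged = cluster_averaged_estimate(sizes, events)
print(f"target={target:.3f} pooled={pooled:.3f} cluster-averaged={averaged:.3f}")
```

Under these assumptions the pooled proportion falls below the cluster-level average risk, while the equally weighted estimate tracks it closely. A joint model goes further by fitting the outcome and cluster size models together through the shared random effect, which this sketch does not attempt.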