Carlin J B, Wolfe R, Brown C H, Gelman A
Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute and University of Melbourne Department of Paediatrics, Parkville, VIC 3052, Australia.
Biostatistics. 2001 Dec;2(4):397-416. doi: 10.1093/biostatistics/2.4.397.
Recent advances in statistical software have led to the rapid diffusion of new methods for modelling longitudinal data. Multilevel (also known as hierarchical or random effects) models for binary outcomes have generally been based on a logistic-normal specification, by analogy with earlier work for normally distributed data. The appropriate application and interpretation of these models remain somewhat unclear, especially when compared with the computationally more straightforward semiparametric or 'marginal' modelling (GEE) approaches. In this paper we pose two interrelated questions. First, what limits should be placed on the interpretation of the coefficients and inferences derived from random-effect models involving binary outcomes? Second, what diagnostic checks are appropriate for evaluating whether such random-effect models provide adequate fits to the data? We address these questions by means of an extended case study using data on adolescent smoking from a large cohort study. Bayesian estimation methods are used to fit a discrete-mixture alternative to the standard logistic-normal model, and posterior predictive checking is used to assess model fit. Surprising parallels in the parameter estimates from the logistic-normal and mixture models are described and used to question the interpretability of the so-called 'subject-specific' regression coefficients from the standard multilevel approach. Posterior predictive checks suggest a serious lack of fit of both multilevel models. The results do not provide final answers to the two questions posed, but we expect that lessons learned from the case study will provide general guidance for further investigation of these important issues.
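The contrast the abstract draws between 'subject-specific' coefficients from a logistic-normal model and 'marginal' (GEE-type) coefficients can be illustrated numerically. The sketch below is not taken from the paper: it assumes a simple random-intercept model, logit P(y = 1 | b) = beta0 + beta1*x + b with b ~ N(0, sigma^2), and the parameter values and function names (`marginal_prob`, `logit`) are hypothetical choices made for illustration. It averages the conditional probability over the random intercept by Gauss-Hermite quadrature and compares the resulting marginal log odds ratio with the subject-specific coefficient.

```python
import numpy as np


def marginal_prob(eta, sigma, n_nodes=40):
    """P(y = 1) averaged over b ~ N(0, sigma^2), given linear predictor eta."""
    # Gauss-Hermite nodes and weights for integrals of the form
    # integral of exp(-t^2) f(t) dt; substitute b = sqrt(2) * sigma * t.
    t, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * t
    p = 1.0 / (1.0 + np.exp(-(eta + b)))
    return float(np.sum(w * p) / np.sqrt(np.pi))


def logit(p):
    return float(np.log(p / (1.0 - p)))


# Illustrative (assumed) values, not estimates from the smoking cohort data.
beta0, beta1, sigma = -1.0, 1.0, 2.0

# Marginal log odds ratio for x = 1 versus x = 0.
marginal_log_or = (logit(marginal_prob(beta0 + beta1, sigma))
                   - logit(marginal_prob(beta0, sigma)))

# Well-known closed-form approximation to the attenuation:
# beta_marginal ~ beta / sqrt(1 + c^2 sigma^2), with c = 16*sqrt(3)/(15*pi).
c = 16.0 * np.sqrt(3.0) / (15.0 * np.pi)
approx_log_or = beta1 / np.sqrt(1.0 + c**2 * sigma**2)

print(f"subject-specific coefficient: {beta1:.3f}")
print(f"marginal log odds ratio:      {marginal_log_or:.3f}")
print(f"attenuation approximation:    {approx_log_or:.3f}")
```

Under these assumptions the marginal log odds ratio is smaller in magnitude than the subject-specific coefficient, and the gap grows with sigma. This is one reason the two kinds of coefficient cannot be read interchangeably, and it shows that the subject-specific interpretation leans on the assumed random-effects distribution.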