Fitzmaurice G M
Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA.
Biometrics. 1995 Mar;51(1):309-17.
Clustered binary data occur commonly in both the biomedical and health sciences. In this paper, we consider logistic regression models for multivariate binary responses, where the association between the responses is largely regarded as a nuisance characteristic of the data. In particular, we consider the estimator based on independence estimating equations (IEE), which assumes that the responses are independent. This estimator has been shown to be nearly efficient when compared with maximum likelihood (ML) and generalized estimating equations (GEE) in a variety of settings. The purpose of this paper is to highlight a circumstance where assuming independence can lead to substantial losses of efficiency. Specifically, when the covariate design includes within-cluster covariates, assuming independence can lead to a considerable loss of efficiency in estimating the regression parameters associated with those covariates.
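The efficiency loss described in the abstract can be illustrated with a small asymptotic calculation. The sketch below is not the paper's own computation. It assumes a hypothetical design of three equally frequent cluster types of size 2, an exchangeable true correlation `rho`, and regression coefficients fixed at zero so that every marginal probability is 0.5 and the assumed correlation is attainable for binary responses. It compares the sandwich variance of the IEE slope estimator with the variance of a GEE estimator whose working correlation equals the true one. The names `DESIGNS` and `relative_efficiency` are illustrative only.

```python
import numpy as np

# Hypothetical design (an assumption, not taken from the paper):
# three equally frequent cluster types of size 2.
# Columns are (intercept, x); x varies within the second cluster type only.
DESIGNS = [
    np.array([[1.0, 0.0], [1.0, 0.0]]),
    np.array([[1.0, 0.0], [1.0, 1.0]]),
    np.array([[1.0, 1.0], [1.0, 1.0]]),
]


def relative_efficiency(rho, designs=DESIGNS, beta=(0.0, 0.0)):
    """Asymptotic Var(GEE slope) / Var(IEE slope) under exchangeable correlation rho."""
    beta = np.asarray(beta, dtype=float)
    p = beta.size
    bread = np.zeros((p, p))  # sum of D' A^-1 D (independence working model)
    meat = np.zeros((p, p))   # sum of D' A^-1 V A^-1 D (true covariance V)
    info = np.zeros((p, p))   # sum of D' V^-1 D (correctly specified GEE)
    for X in designs:
        n = X.shape[0]
        mu = 1.0 / (1.0 + np.exp(-X @ beta))  # marginal logistic means
        a = mu * (1.0 - mu)                   # Bernoulli variances
        R = np.full((n, n), rho)
        np.fill_diagonal(R, 1.0)              # exchangeable correlation matrix
        sd = np.sqrt(a)
        V = sd[:, None] * R * sd[None, :]     # true covariance A^1/2 R A^1/2
        D = a[:, None] * X                    # d mu / d beta = A X
        bread += X.T @ D                      # D' A^-1 D simplifies to X' A X
        meat += X.T @ V @ X                   # D' A^-1 V A^-1 D simplifies to X' V X
        info += D.T @ np.linalg.solve(V, D)
    bread_inv = np.linalg.inv(bread)
    var_iee = bread_inv @ meat @ bread_inv    # robust (sandwich) variance
    var_gee = np.linalg.inv(info)
    return var_gee[1, 1] / var_iee[1, 1]


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6, 0.9):
        print(f"rho={rho:.1f}  efficiency of IEE slope: {relative_efficiency(rho):.3f}")
```

Under these assumed settings the ratio is 1 when `rho` is 0 and decreases as `rho` grows (about 0.77 at `rho = 0.5`), which is consistent with the abstract's claim. If every cluster shared the same covariate pattern, the two estimators would have the same asymptotic variance in this sketch, so the loss here comes from the covariate pattern differing across clusters.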