Neuhaus J M, Jewell N P
Department of Epidemiology, University of California, San Francisco, 94143-0560.
Biometrics. 1990 Dec;46(4):977-90.
Recently, a great deal of attention has been given to binary regression models for clustered or correlated observations. The data of interest take the form of a binary dependent or response variable, together with independent variables X1, ..., Xk, where sets of observations are grouped together into clusters. A number of models and methods of analysis have been suggested to study such data. Many of these are extensions in some way of the familiar logistic regression model for binary data that are not grouped (i.e., each cluster is of size 1). In general, the analyses of these clustered data models proceed by assuming that the observed clusters are a simple random sample of clusters selected from a population of clusters. In this paper, we consider the application of these procedures to the case where the clusters are selected randomly in a manner that depends on the pattern of responses in the cluster. For example, we show that ignoring the retrospective nature of the sample design, by fitting standard logistic regression models for clustered binary data, may result in misleading estimates of the effects of covariates and of the precision of estimated regression coefficients.