Hedeker D, McMahon S D, Jason L A, Salina D
Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago 60612-7260, USA.
Am J Community Psychol. 1994 Oct;22(5):595-615. doi: 10.1007/BF02506895.
Although it is common in community psychology research to have data at both the community (or cluster) level and the individual level, the analysis of such clustered data often presents difficulties for researchers. Since the individuals within a cluster cannot be assumed to be independent, the use of many traditional statistical techniques that assume independence of observations is problematic. Further, there is often interest in assessing the degree of dependence in the data resulting from the clustering of individuals within communities. In this paper, a random-effects regression model is described for the analysis of clustered data. Unlike ordinary regression analysis of clustered data, random-effects regression models do not assume that each observation is independent; instead, they assume that data within clusters are dependent to some degree. The degree of this dependency is estimated along with the usual model parameters, so that the estimated effects are adjusted for the dependency resulting from the clustering of the data. Models are described for both continuous and dichotomous outcome variables, and available statistical software for these models is discussed. An analysis of a data set in which individuals are clustered within firms is used to illustrate features of random-effects regression analysis relative to both individual-level analysis, which ignores the clustering of the data, and cluster-level analysis, which aggregates the individual data.
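The "degree of dependency" described above is commonly summarized as an intraclass correlation: the share of total outcome variance that lies between clusters. The sketch below illustrates that quantity on simulated data. It is not the authors' method or software: it assumes a random-intercept model with no covariates and equal cluster sizes, and it uses the simple one-way ANOVA (moment) estimator rather than the maximum-likelihood estimation a random-effects regression program would perform. The function names `simulate_clusters` and `anova_icc` are hypothetical, introduced here only for illustration.

```python
import random
from statistics import mean


def simulate_clusters(n_clusters, n_per_cluster, sd_cluster, sd_resid, seed=0):
    """Simulate y_ij = u_i + e_ij, where u_i is a cluster-level random effect
    (for example, a firm effect) and e_ij is an individual-level residual."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, sd_cluster)  # effect shared by everyone in the cluster
        data.append([u + rng.gauss(0.0, sd_resid) for _ in range(n_per_cluster)])
    return data


def anova_icc(data):
    """One-way ANOVA estimate of the intraclass correlation.

    Assumes a balanced design: every cluster has the same size n >= 2.
    """
    k = len(data)      # number of clusters
    n = len(data[0])   # individuals per cluster
    grand = mean(y for cluster in data for y in cluster)
    cluster_means = [mean(cluster) for cluster in data]

    # Between-cluster and within-cluster mean squares.
    msb = n * sum((m - grand) ** 2 for m in cluster_means) / (k - 1)
    msw = sum(
        (y - m) ** 2 for cluster, m in zip(data, cluster_means) for y in cluster
    ) / (k * (n - 1))

    # Moment estimate of the between-cluster variance, truncated at zero.
    var_between = max((msb - msw) / n, 0.0)
    return var_between / (var_between + msw)


if __name__ == "__main__":
    # Equal between- and within-cluster variances imply a true ICC of 0.5.
    data = simulate_clusters(200, 10, sd_cluster=1.0, sd_resid=1.0, seed=1)
    print(f"estimated ICC: {anova_icc(data):.3f}")
```

An estimate near zero would suggest that ignoring the clustering does little harm, while larger values indicate that individual-level analyses assuming independence are likely to understate standard errors.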