Begg Melissa D, Parides Michael K
Department of Biostatistics, Mailman School of Public Health of Columbia University, 722 West 168th Street (R626B), New York, NY 10032, U.S.A.
Stat Med. 2003 Aug 30;22(16):2591-602. doi: 10.1002/sim.1524.
This paper focuses on regression analysis of clustered data. Although intracluster correlation (the tendency for items within a cluster to respond alike) is typically viewed as an obstacle to good inference, the complex structure of clustered data offers significant analytic advantages over independent data. One key advantage is the ability to separate effects at the individual (or item-specific) level from those at the group (or cluster-specific) level. We review different approaches for separating individual-level and cluster-level effects on response, discuss their appropriate interpretation, and give recommendations for model fitting based on the intent of the data analyst. Unlike many earlier papers on this topic, we place particular emphasis on the interpretation of the cluster-level covariate effect. The main ideas of the paper are illustrated in an analysis of the relationship between birth weight and IQ using sibling data from a large birth cohort study.
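One common way to separate the two levels of effect is to split each covariate value into its cluster mean and its deviation from that mean, then enter both terms in the regression. The sketch below illustrates that decomposition only; it is not taken from the paper and should not be read as the authors' recommended procedure. The function names `split_within_between` and `fit_within_between` are hypothetical. Ordinary least squares is used purely to obtain point estimates, so the intracluster correlation is ignored; an actual analysis would typically rely on a method that accounts for it when computing standard errors.

```python
import numpy as np


def split_within_between(x, cluster):
    """Return (cluster_mean, deviation) for every observation.

    cluster_mean carries the cluster-level (between) part of x;
    deviation = x - cluster_mean carries the individual-level (within) part.
    """
    x = np.asarray(x, dtype=float)
    # Map arbitrary cluster labels to integer codes 0..K-1.
    _, codes = np.unique(np.asarray(cluster), return_inverse=True)
    sums = np.bincount(codes, weights=x)
    counts = np.bincount(codes)
    cluster_mean = (sums / counts)[codes]
    return cluster_mean, x - cluster_mean


def fit_within_between(y, x, cluster):
    """Least-squares fit of y on the within and between components of x.

    Returns [intercept, beta_within, beta_between]. Point estimates only:
    this simplified sketch does not model the within-cluster correlation.
    """
    cluster_mean, deviation = split_within_between(x, cluster)
    design = np.column_stack([np.ones_like(deviation), deviation, cluster_mean])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef


if __name__ == "__main__":
    # Made-up numbers for illustration: three sibling pairs,
    # covariate in kilograms. These are not data from the study.
    family = [1, 1, 2, 2, 3, 3]
    weight = [3.0, 3.4, 2.8, 3.6, 3.1, 3.9]
    mean_w, dev_w = split_within_between(weight, family)
    # Noise-free response with known within (2) and between (5) slopes.
    score = 100 + 2 * dev_w + 5 * mean_w
    print(fit_within_between(score, weight, family))
```

In the sibling setting, the coefficient on the deviation term compares siblings within the same family, while the coefficient on the family mean compares families with one another. The two can differ, which is why the interpretation of the cluster-level term needs separate attention.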