Hendricks S A, Wassell J T, Collins J W, Sedlak S L
Centers for Disease Control and Prevention, Division of Safety Research, Morgantown, WV 26505-2888, USA.
Stat Med. 1996;15(17-18):1951-60. doi: 10.1002/(sici)1097-0258(19960930)15:18<1951::aid-sim407>3.0.co;2-p.
Study designs in public health research often require estimating the effects of interventions that are applied to a cluster of subjects in a common geographic area, rather than randomly assigned to individual subjects, and where the outcome is dichotomous. Statistical methods that account for the intracluster correlation of measurements must be used, or the standard errors of regression coefficients will be underestimated. Generalized estimating equations (GEE) can be used to account for this correlation, although there are no straightforward methods to determine sample-size requirements for adequate power. A simulation study was performed to calculate power in a GEE model for a proposed study of an intervention designed to reduce lower-back injuries among nursing personnel employed in nursing homes. Nursing homes will be randomly assigned to either an intervention or a control group, and all employees within a nursing home will be treated alike. Historical injury data indicate that the baseline injury risk for each home can be reasonably modelled using a beta distribution. It is assumed that the injury outcome for any individual nurse within a nursing home follows a Bernoulli distribution whose probability is expressed as a logit function of fixed covariates and a random-intercept term specific to each home. The odds ratios for the fixed covariates are taken from previous studies and represent characteristics of the study population. Results indicate that failure to account for intracluster correlation can lead to overestimates of power as well as inflation of type I error by as much as 20 per cent. Although the GEE method accounted for the intracluster correlation when it was present, estimates of the intracluster correlation were negatively biased when no intracluster correlation was present. In addition, and possibly related to these negatively biased estimates, we also found inflated type I error estimates from the GEE method.
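The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and every numerical setting in it is an assumption chosen for illustration: the Beta(2, 18) baseline-risk distribution, 20 homes per arm, 40 nurses per home, an intervention odds ratio of 0.5, and 300 replicates. The sketch also simplifies the model by dropping the individual-level covariates and by using a GEE with an independence working correlation, for which the estimate and its robust (sandwich) variance have closed forms in a two-arm design. The function names `simulate_trial`, `wald_tests` and `rejection_rates` are hypothetical.

```python
import math
import random


def simulate_trial(rng, homes_per_arm, nurses_per_home, odds_ratio, a, b):
    """Return {arm: [(injured, staff), ...]} with one tuple per nursing home."""
    arms = {0: [], 1: []}
    for arm in (0, 1):
        for _ in range(homes_per_arm):
            base = rng.betavariate(a, b)  # home-specific baseline risk (assumed Beta)
            mult = odds_ratio if arm == 1 else 1.0
            # Apply the odds ratio on the logit scale; logit(base) acts as the
            # random intercept of the home.
            risk = base * mult / (1.0 - base + base * mult)
            injured = sum(rng.random() < risk for _ in range(nurses_per_home))
            arms[arm].append((injured, nurses_per_home))
    return arms


def wald_tests(arms, z_crit=1.96):
    """Return (reject_naive, reject_robust) for the intervention log odds ratio."""
    log_odds, var_naive, var_robust = {}, {}, {}
    for arm, homes in arms.items():
        total_y = sum(y for y, _ in homes)
        total_n = sum(n for _, n in homes)
        if total_y == 0 or total_y == total_n:
            return False, False  # log odds undefined; counted as no rejection
        p = total_y / total_n  # pooled risk = independence-GEE fitted value
        info = total_n * p * (1.0 - p)  # model-based information (A)
        meat = sum((y - n * p) ** 2 for y, n in homes)  # squared cluster scores (B)
        log_odds[arm] = math.log(p / (1.0 - p))
        var_naive[arm] = 1.0 / info  # ignores clustering
        var_robust[arm] = meat / info ** 2  # sandwich variance A^-1 B A^-1
    beta = log_odds[1] - log_odds[0]
    se_naive = math.sqrt(var_naive[0] + var_naive[1])
    se_robust = math.sqrt(var_robust[0] + var_robust[1])
    reject_naive = abs(beta) > z_crit * se_naive
    reject_robust = se_robust > 0 and abs(beta) > z_crit * se_robust
    return reject_naive, reject_robust


def rejection_rates(odds_ratio, n_sims=300, homes_per_arm=20,
                    nurses_per_home=40, a=2.0, b=18.0, seed=1996):
    """Fraction of simulated trials in which each Wald test rejects the null."""
    rng = random.Random(seed)
    naive = robust = 0
    for _ in range(n_sims):
        trial = simulate_trial(rng, homes_per_arm, nurses_per_home, odds_ratio, a, b)
        r_naive, r_robust = wald_tests(trial)
        naive += r_naive
        robust += r_robust
    return naive / n_sims, robust / n_sims


if __name__ == "__main__":
    print("type I error (naive, robust):", rejection_rates(odds_ratio=1.0))
    print("power, OR = 0.5 (naive, robust):", rejection_rates(odds_ratio=0.5))
```

Calling `rejection_rates(odds_ratio=1.0)` estimates the type I error of each test, and a non-null odds ratio estimates power. Under these assumed settings, the naive test that ignores clustering should reject a true null far more often than the cluster-robust test, which is the qualitative pattern the abstract reports. A fuller replication would fit a GEE with an exchangeable working correlation and the fixed covariates described above.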