Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada.
BMC Med Res Methodol. 2013 Jan 23;13:9. doi: 10.1186/1471-2288-13-9.
The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e. generalized estimating equations (GEE)) and cluster-specific (i.e. random-effects logistic regression (RELR)) models for analyzing data from cluster randomized trials (CRTs) with missing binary responses.
In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, the intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate-dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI), and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE), and coverage probability.
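The data-generating step and the four performance measures can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `simulate_arm` and `performance` are hypothetical, the beta parameterization (mean `p`, intra-cluster correlation `icc = 1 / (a + b + 1)`) is the usual one for beta-binomial data, and the definitions of standardized bias (bias as a percentage of the empirical standard error) and coverage (Wald-type 95% intervals) are common conventions assumed here, since the abstract does not spell them out.

```python
import numpy as np

def simulate_arm(n_clusters, cluster_size, p, icc, rng):
    """Return an (n_clusters, cluster_size) array of 0/1 responses.

    Each cluster draws its own success probability from a Beta(a, b)
    distribution, so cluster totals follow a beta-binomial distribution.
    """
    # Assumed parameterization: mean a/(a+b) = p and 1/(a+b+1) = icc.
    a = p * (1 - icc) / icc
    b = (1 - p) * (1 - icc) / icc
    cluster_p = rng.beta(a, b, size=n_clusters)
    # Broadcast each cluster's probability across its subjects.
    return rng.binomial(1, cluster_p[:, None], size=(n_clusters, cluster_size))

def performance(estimates, std_errors, truth, z=1.96):
    """Summarize simulation replicates against the true parameter value."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    emp_se = estimates.std(ddof=1)      # empirical standard error
    bias = estimates.mean() - truth
    lower = estimates - z * std_errors  # Wald-type interval (assumption)
    upper = estimates + z * std_errors
    return {
        "standardized_bias_pct": 100 * bias / emp_se,
        "empirical_se": emp_se,
        "rmse": np.sqrt(np.mean((estimates - truth) ** 2)),
        "coverage": np.mean((lower <= truth) & (truth <= upper)),
    }
```

In a full simulation, `simulate_arm` would be called once per trial arm and replicate, missingness would be imposed and handled by one of the three strategies, and the resulting treatment-effect estimates and standard errors would be passed to `performance`.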
GEE performs well on all four measures, provided that the downward bias of the standard error (which arises when the number of clusters per arm is small) is adjusted appropriately, under the following scenarios: complete case analysis for CRTs with a small amount of missing data; standard MI for CRTs with a variance inflation factor (VIF) < 3; and within-cluster MI for CRTs with VIF ≥ 3 and cluster size > 50. RELR performs well only when a small amount of data is missing and complete case analysis is applied.
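The VIF thresholds above can be turned into a small lookup. The sketch below assumes the standard design-effect formula for equal cluster sizes, VIF = 1 + (m - 1) × ICC, which the abstract does not state explicitly; the function names are hypothetical. It covers only the two MI scenarios, because the abstract gives no numeric cutoff for "a small amount of missing data".

```python
def variance_inflation_factor(cluster_size, icc):
    """Design effect for equal-sized clusters (assumed formula)."""
    return 1 + (cluster_size - 1) * icc

def suggested_mi_strategy(cluster_size, icc):
    """Map a CRT design to the MI strategy reported to work well with GEE.

    Returns None when the abstract reports no well-performing MI strategy
    (VIF >= 3 with cluster size <= 50).
    """
    vif = variance_inflation_factor(cluster_size, icc)
    if vif < 3:
        return "standard MI"
    if cluster_size > 50:
        return "within-cluster MI"
    return None
```

For example, a trial with 21 subjects per cluster and an ICC of 0.05 has a VIF of about 2, which falls in the standard MI range.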
GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either standard or within-cluster MI strategy is applied prior to the analysis.