Department of Psychology, University of Southern California, Los Angeles, CA, USA.
Multivariate Behav Res. 2021 Jul-Aug;56(4):558-578. doi: 10.1080/00273171.2020.1746902. Epub 2020 Apr 11.
Although many methodologists and professional organizations have urged applied researchers to compute and report effect size measures accompanying tests of statistical significance, discussions on obtaining confidence intervals (CIs) for effect size with clustered/multilevel data have been scarce. In this paper, I explore the bootstrap as a viable and accessible alternative for obtaining CIs for multilevel standardized mean difference effect size for cluster-randomized trials. A simulation was carried out to compare 17 analytic and bootstrap procedures for constructing CIs for multilevel effect size, in terms of empirical coverage rate and width, for both normal and nonnormal data. Results showed that, overall, the residual bootstrap with studentized CI had the best coverage rates (94.75% on average), whereas the residual bootstrap with basic CI had better coverage in small samples. These two procedures for constructing CIs showed better coverage than using analytic methods for both normal and nonnormal data. In addition, I provide an illustrative example showing how bootstrap CIs for multilevel effect size can be easily obtained using the statistical software R and the R package bootmlm. I strongly encourage applied researchers to report CIs to adequately convey the uncertainty of their effect size estimates.
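To make the idea of a bootstrap CI for a multilevel effect size concrete, here is a minimal sketch. It is not the residual bootstrap evaluated in the paper and does not use bootmlm. It assumes a simpler cluster-level case bootstrap (whole clusters resampled within each arm), a balanced design with equal cluster sizes, and a method-of-moments estimate of the variance components instead of a fitted multilevel model. The function names (`effect_size`, `basic_bootstrap_ci`, `simulate_arm`) and the simulated data are hypothetical and serve only as illustration.

```python
import math
import random
import statistics


def effect_size(treated, control):
    """Standardized mean difference scaled by the total (between + within) SD.

    Each arm is a list of clusters; each cluster is a list of outcomes.
    Assumption: every cluster has the same size (balanced design).
    """
    n = len(treated[0])  # common cluster size

    def arm_stats(clusters):
        means = [statistics.fmean(c) for c in clusters]
        within = statistics.fmean([statistics.variance(c) for c in clusters])
        return statistics.fmean(means), statistics.variance(means), within

    mt, vt, wt = arm_stats(treated)
    mc, vc, wc = arm_stats(control)
    jt, jc = len(treated), len(control)

    sigma2 = (jt * wt + jc * wc) / (jt + jc)                    # pooled within-cluster variance
    var_means = ((jt - 1) * vt + (jc - 1) * vc) / (jt + jc - 2)  # pooled variance of cluster means
    tau2 = max(0.0, var_means - sigma2 / n)                      # moment estimate, truncated at 0
    return (mt - mc) / math.sqrt(tau2 + sigma2)


def basic_bootstrap_ci(treated, control, n_boot=2000, seed=1):
    """95% basic bootstrap CI, resampling whole clusters within each arm."""
    rng = random.Random(seed)
    est = effect_size(treated, control)
    boot = []
    for _ in range(n_boot):
        bt = rng.choices(treated, k=len(treated))
        bc = rng.choices(control, k=len(control))
        boot.append(effect_size(bt, bc))
    cuts = statistics.quantiles(boot, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    q_lo, q_hi = cuts[0], cuts[-1]
    # Basic CI reflects the bootstrap quantiles around the point estimate.
    return est, (2 * est - q_hi, 2 * est - q_lo)


def simulate_arm(rng, n_clusters, cluster_size, shift, tau, sigma):
    """Hypothetical data generator: random cluster intercept plus normal noise."""
    arm = []
    for _ in range(n_clusters):
        u = rng.gauss(0, tau)
        arm.append([shift + u + rng.gauss(0, sigma) for _ in range(cluster_size)])
    return arm


if __name__ == "__main__":
    rng = random.Random(2020)
    treated = simulate_arm(rng, n_clusters=20, cluster_size=10, shift=0.5, tau=0.5, sigma=1.0)
    control = simulate_arm(rng, n_clusters=20, cluster_size=10, shift=0.0, tau=0.5, sigma=1.0)
    est, (lo, hi) = basic_bootstrap_ci(treated, control)
    print(f"effect size = {est:.3f}, 95% basic bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```

Resampling whole clusters keeps the within-cluster dependence intact, which is the reason a naive observation-level bootstrap is unsuitable for clustered data. A studentized CI would additionally need a standard error for each bootstrap replicate, which this sketch omits.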