Division of Epidemiology, School of Public Health, University of California, Berkeley, CA, USA.
BMC Med Res Methodol. 2011 Jun 20;11:94. doi: 10.1186/1471-2288-11-94.
Estimating the required sample size and statistical power for a study is an integral part of study design. For standard designs, power equations provide an efficient solution to the problem, but they are unavailable for many complex study designs that arise in practice. For such complex study designs, computer simulation is a useful alternative for estimating study power. Although this approach is well known among statisticians, in our experience many epidemiologists and social scientists are unfamiliar with the technique. This article aims to address this knowledge gap.
We review an approach to estimate study power for individual- or cluster-randomized designs using computer simulation. This flexible approach arises naturally from the model used to derive conventional power equations, but extends those methods to accommodate arbitrarily complex designs. The method applies to a broad range of designs and outcomes, and we present the material in a way that is approachable for quantitative, applied researchers. We illustrate the method using two examples (one simple, one complex) based on sanitation and nutritional interventions to improve child growth.
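The core idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R or Stata code: the function names, the standardized effect size of 0.4, the sample size of 100 per arm, and the large-sample z-test are all hypothetical choices made for the example. The sketch simulates many two-arm, individually randomized trials with a continuous outcome, tests each one, and reports the fraction of trials that reject the null hypothesis as the estimated power. A conventional closed-form power calculation is included for comparison.

```python
import random
from statistics import NormalDist, fmean, variance


def simulate_power(n_per_arm, effect, sd=1.0, alpha=0.05, n_sims=2000, seed=1):
    """Estimate power as the share of simulated trials that reject H0.

    Assumptions (illustrative only): normal outcomes, equal arm sizes,
    and a two-sided large-sample z-test on the difference in means.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        # Draw one hypothetical trial under the assumed treatment effect.
        control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
        diff = fmean(treated) - fmean(control)
        se = ((variance(treated) + variance(control)) / n_per_arm) ** 0.5
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_sims


def analytic_power(n_per_arm, effect, sd=1.0, alpha=0.05):
    """Conventional normal-approximation power for a difference in means."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ncp = effect / (sd * (2 / n_per_arm) ** 0.5)
    return NormalDist().cdf(ncp - z_crit) + NormalDist().cdf(-ncp - z_crit)


if __name__ == "__main__":
    print("simulated:", simulate_power(n_per_arm=100, effect=0.4))
    print("analytic: ", analytic_power(n_per_arm=100, effect=0.4))
```

The two numbers should agree up to Monte Carlo error, which shrinks as `n_sims` grows. The simulation version is the one that generalizes: changing the data-generating step or the analysis step changes the design being evaluated.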
To familiarize the reader with the approach, we first show how simulation reproduces conventional power estimates for simple randomized designs over a broad range of sample scenarios. We then demonstrate how to extend the simulation approach to more complex designs. Finally, we discuss extensions to the examples in the article, and provide computer code to efficiently run the example simulations in both R and Stata.
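One way the extension to a more complex design might look is sketched below in Python for a cluster-randomized trial. Everything specific here is an assumption made for illustration rather than a detail from the article: the function names, the random-intercept data model with total variance 1, the intraclass correlation of 0.05, 40 clusters of 20 children per arm, and the cluster-level analysis using a z-test on cluster means (reasonable only when the number of clusters is fairly large). The comparison formula inflates the variance by the usual design effect, 1 + (m - 1) × ICC.

```python
import random
from statistics import NormalDist, fmean, variance


def simulate_arm(rng, n_clusters, cluster_size, mean, icc):
    """Return the cluster means for one arm (total outcome variance = 1)."""
    between_sd = icc ** 0.5          # SD of the cluster-level random intercept
    within_sd = (1 - icc) ** 0.5     # SD of the individual-level residual
    means = []
    for _ in range(n_clusters):
        intercept = rng.gauss(0.0, between_sd)
        outcomes = [mean + intercept + rng.gauss(0.0, within_sd)
                    for _ in range(cluster_size)]
        means.append(fmean(outcomes))
    return means


def simulate_cluster_power(n_clusters, cluster_size, effect, icc,
                           alpha=0.05, n_sims=1000, seed=1):
    """Share of simulated cluster-randomized trials that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = simulate_arm(rng, n_clusters, cluster_size, 0.0, icc)
        treated = simulate_arm(rng, n_clusters, cluster_size, effect, icc)
        # Cluster-level analysis: compare arms using the cluster means.
        diff = fmean(treated) - fmean(control)
        se = ((variance(treated) + variance(control)) / n_clusters) ** 0.5
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_sims


def design_effect_power(n_clusters, cluster_size, effect, icc, alpha=0.05):
    """Normal-approximation power with variance inflated by the design effect."""
    deff = 1 + (cluster_size - 1) * icc
    se = (2 * deff / (n_clusters * cluster_size)) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect / se - z_crit) + NormalDist().cdf(-effect / se - z_crit)


if __name__ == "__main__":
    args = dict(n_clusters=40, cluster_size=20, effect=0.2, icc=0.05)
    print("simulated:    ", simulate_cluster_power(**args))
    print("design effect:", design_effect_power(**args))
```

In this simple balanced case a closed-form answer exists, so the simulation serves only as a check. Its value appears once the data-generating step includes features such as unequal cluster sizes, repeated measures, or multiple treatment arms, for which no convenient equation may be available.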
Simulation methods offer a flexible option to estimate statistical power for standard and non-traditional study designs and parameters of interest. The approach we have described is universally applicable for evaluating study designs used in epidemiologic and social science research.