1 Division of Biostatistics, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA.
2 Department of Statistical Science, Southern Methodist University, Dallas, TX, USA.
Stat Methods Med Res. 2019 Feb;28(2):589-598. doi: 10.1177/0962280217731595. Epub 2017 Sep 28.
Paired experimental designs are widely used in clinical and health behavioral studies, where each study unit contributes a pair of observations. Investigators often encounter incomplete observations of paired outcomes in the collected data: some study units contribute complete pairs, while others contribute only a pre-intervention or only a post-intervention observation. Statistical inference for paired designs with incomplete observations of continuous outcomes has been studied extensively in the literature, but sample size methods for such designs are scarce. We derive a closed-form sample size formula based on the generalized estimating equation (GEE) approach by treating the incomplete observations as missing data in a linear model. The proposed method properly accounts for the mixed structure of the observed data, which combine paired and unpaired outcomes. The sample size formula is flexible enough to accommodate different missing patterns, magnitudes of missingness, and values of the correlation parameter. We demonstrate that, under complete observations, the proposed GEE sample size estimate is the same as that based on the paired t-test. In the presence of missing data, the proposed method leads to a more accurate sample size estimate than the crude adjustment. Simulation studies are conducted to evaluate the finite-sample performance of the GEE sample size formula. A real application example is presented for illustration.
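For orientation, the two baselines the abstract compares against can be sketched as follows. This is not the authors' GEE formula, which the abstract does not state. It is a minimal sketch under the standard normal approximation for a two-sided paired test, assuming equal pre- and post-intervention variances, and assuming "crude adjustment" means the common practice of inflating the complete-data sample size by the expected proportion of missing data. The function and parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(delta, sigma, rho, alpha=0.05, power=0.80):
    """Number of complete pairs for a two-sided paired test.

    Normal-approximation sketch (assumption, not the paper's GEE formula):
    delta is the mean pre/post difference to detect, sigma the common
    standard deviation of each measurement, rho the within-pair correlation.
    """
    z = NormalDist().inv_cdf
    # Variance of the within-pair difference under equal variances.
    var_diff = 2.0 * sigma ** 2 * (1.0 - rho)
    n = (z(1.0 - alpha / 2.0) + z(power)) ** 2 * var_diff / delta ** 2
    return ceil(n)


def crude_adjustment(n_complete, missing_prop):
    """Inflate a complete-data sample size by the expected missing proportion.

    Assumed meaning of "crude adjustment": divide by (1 - missing_prop).
    """
    return ceil(n_complete / (1.0 - missing_prop))


if __name__ == "__main__":
    n = paired_sample_size(delta=0.5, sigma=1.0, rho=0.5)
    print("complete pairs needed:", n)
    print("after crude adjustment for 25% missing:", crude_adjustment(n, 0.25))
```

The crude adjustment effectively discards every unit with an incomplete pair, which is why a method that also uses the unpaired observations can give a more accurate sample size.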