Wang Cuiling, Hall Charles B, Kim Mimi
Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, USA
Stat Methods Med Res. 2015 Dec;24(6):1009-29. doi: 10.1177/0962280212437452. Epub 2012 Feb 21.
In many longitudinal studies, the primary interest is the effect of a binary or continuous predictor variable on the rate of change of the outcome, i.e. the slope. Sample size determination for these studies, however, is complicated by the expectation that missing data will occur due to missed visits, early dropout, and staggered entry. Despite the availability of methods for assessing power in longitudinal studies with missing data, the impact on power of the magnitude and distribution of missing data in the study population remains poorly understood. As a result, simple but erroneous alterations of the sample size formulae for complete/balanced data are commonly applied. These 'naive' approaches include the average sum of squares and average number of subjects methods. The goal of this article is to explore in greater detail the effect of missing data on study power and to compare the performance of the naive sample size methods with that of a correct maximum likelihood-based method, using both mathematical and simulation-based approaches. Two different longitudinal aging studies are used to illustrate the methods.
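To make concrete how the distribution of missing data can enter a likelihood-based power calculation, the following is a minimal sketch and not the authors' implementation. It assumes a linear mixed model with a random intercept and slope, known variance components, two equal-sized groups with the same distribution of observed-visit patterns, and missingness that is ignorable. The function names (`inv2`, `mul2`, `subject_information`, `slope_difference_power`) and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def inv2(m):
    """Invert a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mul2(p, q):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def subject_information(times, G, sigma2):
    """Information about (intercept, slope) from one subject seen at `times`.

    With design W = [1, t] and V = W G W' + sigma2 * I, the Woodbury identity
    gives W' V^{-1} W = (S - S (sigma2 * G^{-1} + S)^{-1} S) / sigma2,
    where S = W'W. G must be invertible (assumed here).
    """
    n = len(times)
    s1 = sum(times)
    s2 = sum(t * t for t in times)
    S = ((n, s1), (s1, s2))
    G_inv = inv2(G)
    A = tuple(
        tuple(sigma2 * G_inv[i][j] + S[i][j] for j in range(2)) for i in range(2)
    )
    SAS = mul2(mul2(S, inv2(A)), S)
    return tuple(
        tuple((S[i][j] - SAS[i][j]) / sigma2 for j in range(2)) for i in range(2)
    )


def slope_difference_power(n_per_group, delta, patterns, G, sigma2, alpha=0.05):
    """Approximate two-sided power to detect a slope difference `delta`.

    `patterns` is a list of (probability, observed_times) pairs describing
    the assumed distribution of visit patterns within each group.
    """
    # Expected per-subject information, averaged over visit patterns.
    info = [[0.0, 0.0], [0.0, 0.0]]
    for prob, times in patterns:
        m = subject_information(times, G, sigma2)
        for i in range(2):
            for j in range(2):
                info[i][j] += prob * m[i][j]
    # Variance of one group's slope estimate; the groups are independent,
    # so the variance of the difference is the sum of the two variances.
    var_slope = inv2(info)[1][1] / n_per_group
    se_diff = (2.0 * var_slope) ** 0.5
    z = NormalDist()
    # Normal approximation, ignoring the negligible opposite tail.
    return z.cdf(abs(delta) / se_diff - z.inv_cdf(1.0 - alpha / 2.0))


# Illustrative (made-up) inputs: five annual visits, 40% of subjects drop out early.
G_example = ((1.0, 0.0), (0.0, 0.04))  # random intercept / slope covariance
complete = [(1.0, [0, 1, 2, 3, 4])]
with_dropout = [(0.6, [0, 1, 2, 3, 4]), (0.25, [0, 1, 2]), (0.15, [0, 1])]

print(slope_difference_power(100, 0.1, complete, G_example, 1.0))
print(slope_difference_power(100, 0.1, with_dropout, G_example, 1.0))
```

Because the information is accumulated pattern by pattern, the calculation depends on which visits are missing and not only on how many, which is the kind of detail that adjustments based on averages can lose.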