Li Zhigang, McKeague Ian W
Dartmouth College.
Columbia University.
Stat Sin. 2013 Jan 1;23(1):231-250. doi: 10.5705/ss.2011.081.
We consider the problem of calculating power and sample size for tests based on generalized estimating equations (GEE) that arise in studies involving clustered or correlated data (e.g., longitudinal studies and sibling studies). Previous approaches approximate the power of such tests using the asymptotic behavior of the test statistics under alternatives. We develop a more accurate approach in which the asymptotic behavior is studied under a sequence of alternatives that converge to the null hypothesis at a rate proportional to the inverse square root of the number of clusters. Based on this approach, explicit sample size formulae are derived for Wald and quasi-score test statistics in a variety of GEE settings. Simulation results show that in the important special case of logistic regression with exchangeable correlation structure, previous approaches can inflate the projected sample size (to obtain nominal 90% power using the Wald statistic) by over 10%, whereas the proposed approach provides an accuracy of around 2%.
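As a rough illustration of how a power and sample size calculation for a one-parameter, two-sided Wald test can be set up, the sketch below uses the textbook normal approximation in which the standardized estimate is shifted by the square root of the number of clusters times the effect, divided by a per-cluster standard deviation. It is not the formulae derived in the paper. The function names `wald_power` and `clusters_needed` and the parameters `effect` and `sd` are hypothetical, and `sd` is assumed known here, whereas in a GEE analysis it would have to come from the robust variance under the chosen working correlation.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def wald_power(m, effect, sd, alpha=0.05):
    """Approximate power of a two-sided, 1-df Wald test with m clusters.

    effect: assumed deviation of the parameter from its null value
            (hypothetical input).
    sd:     assumed per-cluster standard deviation, i.e. the square root of
            the asymptotic variance of sqrt(m) * (estimate - truth).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = sqrt(m) * abs(effect) / sd  # mean of the standardized estimate
    # Probability that the standardized estimate lands in either rejection tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def clusters_needed(effect, sd, alpha=0.05, power=0.90):
    """Smallest number of clusters whose approximate power reaches the target."""
    if effect == 0:
        raise ValueError("effect must be non-zero")
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_pow = _STD_NORMAL.inv_cdf(power)
    # Closed-form starting value that ignores the far rejection tail.
    m = max(1, ceil(((z_crit + z_pow) * sd / abs(effect)) ** 2))
    # Adjust in both directions so the result is the true minimum
    # under this approximation.
    while wald_power(m, effect, sd, alpha) < power:
        m += 1
    while m > 1 and wald_power(m - 1, effect, sd, alpha) >= power:
        m -= 1
    return m


if __name__ == "__main__":
    m = clusters_needed(effect=0.5, sd=1.0)
    print(m, round(wald_power(m, 0.5, 1.0), 4))
```

The inputs in the usage lines are made-up values chosen only to exercise the functions; they are not taken from the paper's simulations.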