Biostatistics Unit, Gertner Institute for Epidemiology and Health Policy Research, Israel.
Stat Med. 2010 Jan 15;29(1):97-107. doi: 10.1002/sim.3728.
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
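The gap between the population prevalence and the prevalence of cases at the covariate mean can be illustrated numerically. The following is a minimal sketch, not the authors' procedure and not an implementation of either sample size method. It assumes a standardized covariate X ~ N(0, 1) and the model logit P(Y = 1 | X = x) = b0 + b1*x, so that the prevalence at the covariate mean is expit(b0) and the population prevalence is the average of expit(b0 + b1*X) over the distribution of X. The function names, the integration grid, and the bisection bounds are illustrative choices.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def population_prevalence(b0, b1, n_grid=4001, half_width=8.0):
    """Approximate E[expit(b0 + b1*X)] for X ~ N(0, 1).

    Uses the trapezoidal rule on [-half_width, half_width]; the grid size
    and range are assumptions made for this illustration.
    """
    h = 2.0 * half_width / (n_grid - 1)
    norm_const = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n_grid):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = norm_const * math.exp(-0.5 * x * x)
        total += weight * expit(b0 + b1 * x) * density
    return total * h


def intercept_for_prevalence(target, b1, lo=-30.0, hi=30.0, tol=1e-10):
    """Find b0 by bisection so that the population prevalence equals target.

    Relies on the population prevalence being increasing in b0.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if population_prevalence(mid, b1) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    target = 0.10  # hypothetical population prevalence
    for b1 in (0.0, 0.5, 1.0, 1.5):  # hypothetical log odds ratios per SD
        b0 = intercept_for_prevalence(target, b1)
        print(f"b1={b1:.1f}  prevalence at covariate mean={expit(b0):.4f}  "
              f"population prevalence={population_prevalence(b0, b1):.4f}")
```

Under these assumptions, when the population prevalence is below 0.5 and the covariate effect is nonzero, the prevalence at the covariate mean is smaller than the population prevalence, and the gap widens as the effect grows. This is consistent with the abstract's remark that supplying the population prevalence in place of the required parameter can degrade a method's performance.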