Liu Fang
Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA.
Biometrics. 2013 Jun;69(2):530-6. doi: 10.1111/biom.12026. Epub 2013 May 31.
We reexamine sample size determination (SSD) for testing the logarithm of the odds ratio (OR) against zero in two independent binomials. Four common approaches are considered: a closed-form sample size (SS) formula based on the Wald test (nW), closed-form formulas that meet the SS requirement of the score and exact tests, respectively (nS and nE), and a numerical approach to calculating SS based on likelihood ratio (LR) tests (nL). Several practically useful findings are presented. First, nW is a strictly convex function of OR on each of the ranges OR > 1 and OR < 1, which implies that the SS calculated by nW does not necessarily decrease as OR moves further away from 1. However, the minimum SS often occurs at OR values that are considered relatively extreme and rare in practice. In contrast, nS, nE, and nL decrease monotonically as OR diverges from 1. Second, the optimal sampling ratio (OSR) between the two independent binomials, that is, the ratio that yields maximum power for a given total SS, is not always 1:1 but depends on the odds of the outcome in each arm. nW benefits the most from applying the OSR, in that the total SS can be reduced significantly compared with the commonly used 1:1 sampling ratio. The savings in SS from the OSR for nS, nL, and nE are relatively immaterial from a practical perspective. Finally, we use simulation studies to examine the power loyalty of each SS approach and explore penalized likelihood as a remedy for undermined power loyalty.
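The two findings about nW can be illustrated with a minimal sketch. It assumes the textbook Wald variance for the estimated log OR, 1/(n1 p1 (1 - p1)) + 1/(n2 p2 (1 - p2)), which may differ in detail from the nW formula used in the paper. The function names, the default alpha and power, and the choice of p1 as the reference-arm event probability are all hypothetical and not taken from the source. Under this assumed variance, minimizing n1 + n2 for fixed power gives n2/n1 = sqrt(p1 (1 - p1) / (p2 (1 - p2))).

```python
from math import log, sqrt
from statistics import NormalDist


def arm_probability(p_ref: float, odds_ratio: float) -> float:
    """Event probability in the second arm implied by p_ref and the OR."""
    odds = odds_ratio * p_ref / (1.0 - p_ref)
    return odds / (1.0 + odds)


def wald_total_ss(p1: float, odds_ratio: float, ratio: float = 1.0,
                  alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded total sample size n1 + n2 with n2 = ratio * n1.

    Assumes a two-sided Wald test of log(OR) = 0 with the textbook
    variance 1/(n1*p1*q1) + 1/(n2*p2*q2); this is an illustrative
    assumption, not necessarily the paper's exact nW.
    """
    p2 = arm_probability(p1, odds_ratio)
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha / 2.0) + std_normal.inv_cdf(power)
    a = p1 * (1.0 - p1)  # binomial variance factor, arm 1
    b = p2 * (1.0 - p2)  # binomial variance factor, arm 2
    n1 = z ** 2 * (1.0 / a + 1.0 / (ratio * b)) / log(odds_ratio) ** 2
    return n1 * (1.0 + ratio)


def wald_optimal_ratio(p1: float, odds_ratio: float) -> float:
    """n2/n1 that minimizes the total SS under the assumed Wald variance."""
    p2 = arm_probability(p1, odds_ratio)
    return sqrt(p1 * (1.0 - p1) / (p2 * (1.0 - p2)))


if __name__ == "__main__":
    # Total SS at 1:1 versus at the optimal ratio, for increasing OR.
    for odds_ratio in (2.0, 5.0, 20.0, 100.0):
        r = wald_optimal_ratio(0.5, odds_ratio)
        print(odds_ratio,
              wald_total_ss(0.5, odds_ratio),
              r,
              wald_total_ss(0.5, odds_ratio, ratio=r))
```

With p1 = 0.5, the 1:1 total from this sketch falls as OR increases from 2 to 20 and then rises again by OR = 100, which matches the non-monotone behavior described for nW. The gap between the 1:1 total and the optimal-ratio total also widens as the two arms' event probabilities move apart.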