Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Jordi Girona, 31, Barcelona, 08034, Spain.
Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam.
BMC Med Res Methodol. 2021 May 6;21(1):99. doi: 10.1186/s12874-021-01286-x.
Sample size calculation is a key point in the design of a randomized controlled trial. With time-to-event outcomes, it is often based on the logrank test. We provide a sample size calculation method for a composite endpoint (CE) based on the geometric average hazard ratio (gAHR) for the case in which the proportional hazards assumption can be assumed to hold for the components, but not for the CE.
The formulae for the required number of events, sample size and power are based on the non-centrality parameter of the logrank test under the alternative hypothesis, which is a function of the gAHR. We use the web platform CompARE for the sample size computations. A simulation study evaluates the empirical power of the logrank test for the CE when the sample size is computed from the gAHR. We consider different values of the component hazard ratios, the probabilities of observing the events in the control group and the degrees of association between the components. We illustrate the sample size computations using two published randomized controlled trials. Their primary CEs are, respectively, progression-free survival (time to progression of disease or death) and the composite of bacteriologically confirmed treatment failure or Staphylococcus aureus-related death by 12 weeks.
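As a rough illustration of how such a calculation can be organised, the sketch below applies the usual Schoenfeld-type logrank formulae with the gAHR substituted for the hazard ratio. This is an assumption about the form of the formulae, not a reproduction of the authors' method or of the CompARE implementation. The function names, the two-sided significance level, the allocation proportion `alloc` and the event probabilities are all hypothetical choices made for the example.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for quantiles and the CDF


def required_events(gahr, alpha=0.05, power=0.80, alloc=0.5):
    """Events needed by a two-sided logrank test (Schoenfeld-type formula).

    Assumption: the gAHR plays the role of the hazard ratio in the formula.
    `alloc` is the proportion of patients allocated to the treatment arm.
    """
    if gahr <= 0 or gahr == 1:
        raise ValueError("gahr must be positive and different from 1")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_beta) ** 2 / (alloc * (1 - alloc) * log(gahr) ** 2)


def required_sample_size(gahr, p_control, p_treated,
                         alpha=0.05, power=0.80, alloc=0.5):
    """Total sample size: required events divided by the pooled probability
    of observing the composite event (hypothetical inputs)."""
    p_pooled = (1 - alloc) * p_control + alloc * p_treated
    return ceil(required_events(gahr, alpha, power, alloc) / p_pooled)


def approximate_power(gahr, events, alpha=0.05, alloc=0.5):
    """Approximate power from the non-centrality parameter of the logrank
    statistic, ignoring the negligible opposite-tail rejection probability."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = sqrt(events * alloc * (1 - alloc)) * abs(log(gahr))
    return _STD_NORMAL.cdf(noncentrality - z_alpha)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the trials in the text.
    events = required_events(gahr=0.7)
    print("events:", ceil(events))
    print("sample size:", required_sample_size(0.7, p_control=0.5, p_treated=0.4))
    print("power check:", round(approximate_power(0.7, events), 3))
```

Feeding the required number of events back into `approximate_power` returns the target power, which is a quick consistency check on the two formulae.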
For a target power of 0.80, the simulation study provided mean (± SE) empirical powers equal to 0.799 (±0.004) and 0.798 (±0.004) in the exponential and non-exponential settings, respectively. The power was attained in more than 95% of the simulated scenarios and was always above 0.78, regardless of compliance with the proportional hazards assumption.
The geometric average hazard ratio as an effect measure for a composite endpoint has a meaningful interpretation in the case of non-proportional hazards. Furthermore, it is the natural effect measure when using the logrank test to compare the hazard rates of two groups and should be used instead of the standard hazard ratio.