Charité - Universitätsmedizin Berlin, Corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Biometry and Clinical Epidemiology, Berlin, Germany.
Biostatistics and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany.
Stat Methods Med Res. 2023 Sep;32(9):1749-1765. doi: 10.1177/09622802231188515. Epub 2023 Jul 25.
In oncology, phase II clinical trials are often planned as single-arm two-stage designs with a binary endpoint, for example, progression-free survival after 12 months, and the option to stop for futility after the first stage. Simon's two-stage design is a very popular approach, but depending on the follow-up time required to measure the patients' outcomes, the trial may have to be paused for an undesirably long time. To shorten this forced interruption, it has been proposed to base the interim decision on a short-term endpoint, such as progression-free survival after 3 months. We show that if the assumptions for the short-term endpoint are misspecified, the interim decision can be misleading, resulting in a great loss of statistical power. For the setting of a binary endpoint with nested measurements, such as progression-free survival, we propose two approaches that use all available short-term and long-term assessments of the endpoint to guide the interim decision. One approach is based on conditional power and the other on the Bayesian posterior predictive probability of success. In extensive simulations, we show that both methods perform similarly when appropriately calibrated and can greatly improve power compared to the existing approach in settings with slow patient recruitment. Software code to implement the methods is made publicly available.
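The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' published software and does not implement their nested short-term/long-term approach: it covers only the simplified case in which every stage 1 patient already has the long-term binary outcome. All function names, the design values (r1 = 1, n1 = 10, r = 5, n = 29) and the uniform Beta(1, 1) prior are assumptions chosen for illustration.

```python
from math import comb, exp, lgamma, log


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_sf(k, n, p):
    """P(X > k) for X ~ Binomial(n, p); equals 1 (up to rounding) when k < 0."""
    return sum(binom_pmf(j, n, p) for j in range(max(k + 1, 0), n + 1))


def simon_characteristics(r1, n1, r, n, p):
    """Operating characteristics of a Simon-type two-stage design.

    Stop for futility if stage 1 responses <= r1; otherwise continue to
    n patients and declare success if total responses > r.
    Returns (probability of early termination, probability of success,
    expected sample size) under true response rate p.
    """
    pet = 1.0 - binom_sf(r1, n1, p)
    success = sum(
        binom_pmf(x1, n1, p) * binom_sf(r - x1, n - n1, p)
        for x1 in range(r1 + 1, n1 + 1)
    )
    expected_n = n1 + (1.0 - pet) * (n - n1)
    return pet, success, expected_n


def conditional_power(x1, n1, r, n, p):
    """P(total responses > r | x1 responses in stage 1), assuming rate p in stage 2."""
    return binom_sf(r - x1, n - n1, p)


def log_beta(a, b):
    """Logarithm of the Beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_probability(x1, n1, r, n, a=1.0, b=1.0):
    """Posterior predictive probability that total responses exceed r.

    Uses a Beta(a, b) prior (hypothetical default: uniform), so stage 2
    responses follow a beta-binomial distribution given the stage 1 data.
    """
    m = n - n1
    a_post, b_post = a + x1, b + n1 - x1
    total = 0.0
    for x2 in range(max(r - x1 + 1, 0), m + 1):
        log_pmf = (
            log(comb(m, x2))
            + log_beta(a_post + x2, b_post + m - x2)
            - log_beta(a_post, b_post)
        )
        total += exp(log_pmf)
    return total


if __name__ == "__main__":
    # Illustrative design and response rates (assumed, not from the paper).
    design = dict(r1=1, n1=10, r=5, n=29)
    for label, p in (("p0 = 0.10", 0.10), ("p1 = 0.30", 0.30)):
        pet, success, en = simon_characteristics(p=p, **design)
        print(f"{label}: PET={pet:.3f}, P(success)={success:.3f}, E[N]={en:.2f}")
    print("Conditional power, x1=3:", conditional_power(3, 10, 5, 29, p=0.30))
    print("Predictive probability, x1=3:", predictive_probability(3, 10, 5, 29))
```

An interim rule in this simplified setting would continue the trial only if the conditional power or the predictive probability exceeds a chosen threshold. The abstract notes that such thresholds need appropriate calibration, which this sketch does not attempt.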