Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, USA.
Center for International Blood and Marrow Transplant Research, Milwaukee, WI, USA.
Clin Trials. 2024 Apr;21(2):152-161. doi: 10.1177/17407745231203391. Epub 2023 Oct 25.
BACKGROUND/AIMS: Protecting patient safety is an essential component of the conduct of clinical trials. Rigorous safety monitoring schemes are implemented for these studies to guard against excess toxicity risk from study therapies. They often include protocol-specified stopping rules dictating that an excessive number of safety events will trigger a halt of the study. Statistical methods are useful for constructing rules that protect patients from exposure to excessive toxicity while also keeping the chance of a false safety signal low. Several statistical techniques have been proposed for this purpose, but the current literature lacks a rigorous comparison to determine which method is best suited to a given trial design. The aims of this article are (1) to describe a general framework for repeated monitoring of safety events in clinical trials; (2) to survey common statistical techniques for creating safety stopping criteria; and (3) to provide investigators with a software tool for constructing and assessing these stopping rules.
METHODS: The properties and operating characteristics of stopping rules produced by Pocock and O'Brien-Fleming tests, Bayesian Beta-Binomial models, and sequential probability ratio tests (SPRTs) are studied and compared for common scenarios that may arise in phase II and III trials. We developed the R package "stoppingrule" for constructing and evaluating stopping rules from these methods. Its usage is demonstrated through a redesign of a stopping rule for BMT CTN 0601 (registered at Clinicaltrials.gov as NCT00745420), a phase II, single-arm clinical trial that evaluated outcomes in pediatric sickle cell disease patients treated by bone marrow transplant.
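To illustrate how a Bayesian Beta-Binomial stopping boundary of the kind named above can be derived, the following is a minimal Python sketch rather than the "stoppingrule" package itself. It assumes the rule "stop when the posterior probability that the toxicity rate exceeds p0 is above a threshold", with monitoring after every patient. The function names, the integer-valued prior parameters, the reference rate of 0.2, and the threshold of 0.9 are all illustrative assumptions, not values taken from the article or from BMT CTN 0601.

```python
from math import comb


def posterior_prob_exceeds(x, n, p0, a, b):
    """P(p > p0 | x events among n patients) under a Beta(a, b) prior.

    Assumes integer a and b so the Beta tail can be written as a binomial sum:
    P(Beta(alpha, beta) > p0) = P(Binomial(alpha + beta - 1, p0) <= alpha - 1).
    """
    alpha, beta = a + x, b + n - x
    m = alpha + beta - 1
    return sum(comb(m, j) * p0**j * (1 - p0) ** (m - j) for j in range(alpha))


def beta_binomial_boundary(n_max, p0, a, b, threshold):
    """For each sample size n, the smallest event count that triggers stopping.

    A value of None means no event count at that n can trigger stopping.
    """
    boundary = {}
    for n in range(1, n_max + 1):
        boundary[n] = next(
            (x for x in range(n + 1)
             if posterior_prob_exceeds(x, n, p0, a, b) > threshold),
            None,
        )
    return boundary


# Hypothetical priors centred at p0 = 0.2: a weak one (prior weight of 5
# patients) and a strong one (prior weight of 20 patients).
weak = beta_binomial_boundary(n_max=30, p0=0.2, a=1, b=4, threshold=0.9)
strong = beta_binomial_boundary(n_max=30, p0=0.2, a=4, b=16, threshold=0.9)
for n in (5, 10, 20, 30):
    print(n, weak[n], strong[n])
```

Printing the two boundaries side by side shows how the choice of prior weight shifts the number of events needed to stop at each sample size. In practice the threshold would also be tuned so that the overall false-alarm probability meets a target, which this sketch does not attempt.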
RESULTS: Methods with aggressive stopping criteria early in the trial, such as the Pocock test and Bayesian Beta-Binomial models with weak priors, have permissive stopping criteria at late stages. This creates a trade-off: rules with aggressive early monitoring generally have fewer expected toxicities but also lower power than rules with more conservative early stopping, such as the O'Brien-Fleming test and Beta-Binomial models with strong priors. The modified SPRT method is sensitive to the choice of alternative toxicity rate. The maximized SPRT generally has a higher number of expected toxicities and/or worse power than other methods.
CONCLUSIONS: Because the goal is to minimize the number of patients exposed to and experiencing toxicities from an unsafe therapy, we recommend using the Pocock test or the Beta-Binomial model with a weak prior for constructing safety stopping rules. At the design stage, the operating characteristics of candidate rules should be evaluated under various possible toxicity rates in order to guide the choice of rule(s) for a given trial; our R package facilitates this evaluation.
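The design-stage evaluation described above can be sketched in a few lines. The Python code below is an illustrative assumption, not the "stoppingrule" package: it builds a Pocock-style boundary with a constant nominal level at every look (left uncalibrated here, whereas a real design would tune that level to a target overall false-alarm rate) and then computes exact operating characteristics at a given true toxicity rate, assuming monitoring after every patient. All function names and numeric settings are hypothetical.

```python
from math import comb


def binom_tail(n, x, p):
    """P(Binomial(n, p) >= x)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(x, n + 1))


def constant_level_boundary(n_max, p0, nominal_alpha):
    """Pocock-style rule: at each n, stop at the smallest event count whose
    one-sided binomial p-value against p0 is <= nominal_alpha (None if none)."""
    return {
        n: next(
            (x for x in range(1, n + 1) if binom_tail(n, x, p0) <= nominal_alpha),
            None,
        )
        for n in range(1, n_max + 1)
    }


def operating_characteristics(boundary, n_max, p):
    """Exact stopping probability, expected enrollment, and expected number
    of toxicities when the true toxicity rate is p."""
    alive = {0: 1.0}  # alive[x] = P(trial still running with x events so far)
    stop_prob = exp_n = exp_tox = 0.0
    for n in range(1, n_max + 1):
        # Enroll one more patient and update the event-count distribution.
        nxt = {}
        for x, pr in alive.items():
            nxt[x] = nxt.get(x, 0.0) + pr * (1 - p)
            nxt[x + 1] = nxt.get(x + 1, 0.0) + pr * p
        # Paths at or above the boundary stop here; the rest continue.
        alive = {}
        b = boundary[n]
        for x, pr in nxt.items():
            if b is not None and x >= b:
                stop_prob += pr
                exp_n += pr * n
                exp_tox += pr * x
            else:
                alive[x] = pr
    # Trials that never hit the boundary run to full enrollment.
    for x, pr in alive.items():
        exp_n += pr * n_max
        exp_tox += pr * x
    return stop_prob, exp_n, exp_tox


rule = constant_level_boundary(n_max=30, p0=0.2, nominal_alpha=0.01)
for true_rate in (0.2, 0.3, 0.4, 0.5):
    print(true_rate, operating_characteristics(rule, 30, true_rate))
```

At the reference rate p0 the stopping probability is the false-alarm rate; at higher rates it is the power. Comparing these figures and the expected number of toxicities across candidate rules reproduces, in miniature, the kind of comparison the abstract recommends.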