Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.
Merck & Co Inc, Kenilworth, New Jersey.
JAMA Netw Open. 2021 May 3;4(5):e218175. doi: 10.1001/jamanetworkopen.2021.8175.
IMPORTANCE: Phase 2 trials and early efficacy end points play a crucial role in informing decisions about whether to continue to phase 3 trials. Conventional end points, such as objective response rate (ORR) and progression-free survival (PFS), have demonstrated inconsistent associations with overall survival (OS) benefits in immune checkpoint inhibitor (ICI) trials. Restricted mean duration of response (DOR) is a rigorous metric that combines both response status and duration information. However, its utility in clinical development has not been comprehensively explored.
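To illustrate how a single number can combine response status and duration, the following is a minimal sketch of one way to estimate a restricted mean DOR: the area under the Kaplan-Meier curve for PFS minus the area under the Kaplan-Meier curve for time to first response or progression, both truncated at a horizon `tau`. This decomposition, the function names `km_rmst` and `restricted_mean_dor`, and the toy data are assumptions for illustration and are not taken from the study. Nonresponders contribute zero duration, and each responder is assumed to respond no later than their PFS time.

```python
import numpy as np

def km_rmst(time, event, tau):
    """Area under the Kaplan-Meier curve from 0 to tau (restricted mean)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # Distinct observed event times up to the horizon, in increasing order
    event_times = np.unique(time[event & (time <= tau)])
    surv, last_t, area = 1.0, 0.0, 0.0
    for t in event_times:
        area += surv * (t - last_t)           # rectangle before the drop at t
        at_risk = np.sum(time >= t)           # patients still under observation
        deaths = np.sum((time == t) & event)  # events exactly at t
        surv *= 1.0 - deaths / at_risk
        last_t = t
    area += surv * (tau - last_t)             # final rectangle up to tau
    return area

def restricted_mean_dor(t_resp, t_pfs, pfs_event, responded, tau):
    """Mean time in response within [0, tau], averaged over all patients."""
    t_pfs = np.asarray(t_pfs, dtype=float)
    pfs_event = np.asarray(pfs_event, dtype=bool)
    responded = np.asarray(responded, dtype=bool)
    # Time to the first of response or progression/death (or censoring)
    t_first = np.where(responded, np.asarray(t_resp, dtype=float), t_pfs)
    first_event = responded | pfs_event
    return km_rmst(t_pfs, pfs_event, tau) - km_rmst(t_first, first_event, tau)

# Toy data (hypothetical, in months): two responders, two nonresponders
t_resp = [2.0, np.nan, 1.0, np.nan]    # response time; unused for nonresponders
t_pfs = [10.0, 3.0, 5.0, 6.0]          # progression/death or censoring time
pfs_event = [True, True, True, True]   # False would mean censored
responded = [True, False, True, False]
print(restricted_mean_dor(t_resp, t_pfs, pfs_event, responded, tau=12.0))
```

With no censoring, this estimate reduces to the plain average of each patient's time in response up to `tau`, which is a convenient sanity check for the sketch.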
OBJECTIVE: To determine whether using restricted mean DOR in phase 2 trials can advance promising regimens to phase 3 trials sooner and eliminate unfavorable regimens earlier and with a higher degree of confidence compared with PFS and ORR.
DESIGN, SETTING, AND PARTICIPANTS: This simulation modeling study constructed randomized phase 2 screening trials by resampling 1376 patients from 2 completed randomized phase 3 trials of ICIs. Data were analyzed from August 2019 to July 2020.
EXPOSURES: Use of ICIs.
MAIN OUTCOMES AND MEASURES: Restricted mean DOR, PFS, ORR, and OS were estimated and compared between groups. Three scenarios were considered: (1) significant differences in OS, PFS, and ORR; (2) significant differences in OS and noticeable differences in ORR but not PFS; and (3) no differences in OS, PFS, or ORR. For each setting, 5000 randomized phase 2 trials with different sample sizes were simulated, with additional censoring applied to mimic staggered accruals and ensure fair comparisons between different analysis methods. Probabilities of concluding positive phase 2 trials using PFS, ORR, and DOR were summarized and compared.
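The resampling design described above can be sketched roughly as follows: draw a small two-arm trial from each arm's patient pool, test the end point, and report the fraction of simulated trials that conclude positive. This is a simplified illustration, not the study's procedure. It assumes sampling with replacement, a one-sided Welch-type z-test on a per-patient end point value, and synthetic pools. It also omits the additional censoring step. The names `one_sided_p` and `positive_rate` are hypothetical.

```python
import math
import numpy as np

def one_sided_p(x_trt, x_ctl):
    """One-sided p-value for mean(x_trt) > mean(x_ctl), normal approximation."""
    diff = x_trt.mean() - x_ctl.mean()
    se = math.sqrt(x_trt.var(ddof=1) / len(x_trt) + x_ctl.var(ddof=1) / len(x_ctl))
    if se == 0.0:
        # Degenerate samples with no variability in either arm
        return 0.0 if diff > 0 else 1.0
    # Upper-tail probability of a standard normal at z = diff / se
    return 0.5 * math.erfc(diff / se / math.sqrt(2.0))

def positive_rate(pool_trt, pool_ctl, n_per_arm, alpha, n_trials, rng):
    """Fraction of resampled phase 2 trials with p < alpha."""
    hits = 0
    for _ in range(n_trials):
        # Assumption: patients are resampled with replacement within each arm
        trt = rng.choice(pool_trt, size=n_per_arm, replace=True)
        ctl = rng.choice(pool_ctl, size=n_per_arm, replace=True)
        if one_sided_p(trt, ctl) < alpha:
            hits += 1
    return hits / n_trials

# Synthetic per-patient end point values (placeholders, not trial data)
rng = np.random.default_rng(2021)
pool_trt = rng.exponential(scale=4.0, size=700)
pool_ctl = rng.exponential(scale=3.0, size=700)
for n in (30, 60, 90):
    rate = positive_rate(pool_trt, pool_ctl, n, alpha=0.10, n_trials=5000, rng=rng)
    print(n, rate)
```

When the two pools share the same distribution, the returned rate estimates the false-positive rate and should sit near `alpha`; when they differ, it estimates power.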
RESULTS: The restricted mean DOR difference correctly estimated a positive OS benefit more frequently than did the ORR or PFS tests across the sample sizes, significance levels, and censoring levels evaluated. When both OS and PFS differed, the ranges of true-positive rates (power) were 79.2% to 98.7% for DOR, 56.3% to 93.2% for PFS, and 67.0% to 96.0% for ORR. When OS differed but PFS did not, the ranges of power were 24.0% to 76.0% for DOR, 3.0% to 19.0% for PFS, and 10.5% to 38.0% for ORR. When OS was similar, the false-positive rate of the restricted mean DOR test was close to the chosen significance level.
CONCLUSIONS AND RELEVANCE: These findings suggest that restricted mean DOR in randomized phase 2 trials is potentially more sensitive and useful than PFS and ORR in estimating the subsequent phase 3 conclusions and, thus, may be considered as a complementary end point to facilitate decision-making in future clinical development.