Institute of Medical Biometry (IMBI), Department Medical Biometry, Heidelberg University, Heidelberg, Germany.
Med Decis Making. 2024 May;44(4):365-379. doi: 10.1177/0272989X241239928. Epub 2024 May 9.
For time-to-event endpoints, three additional benefit assessment methods have been developed to provide an unbiased understanding of the magnitude of clinical benefit of newly approved treatments. The American Society of Clinical Oncology (ASCO) defines a continuous score using the hazard ratio point estimate (HR-PE). The European Society for Medical Oncology (ESMO) and the German Institute for Quality and Efficiency in Health Care (IQWiG) developed methods with an ordinal outcome that use the lower and upper limits of the 95% HR confidence interval (HR-CI), respectively. We describe all three frameworks for additional benefit assessment, aiming at a fair comparison across different stakeholders. Furthermore, we determine which ASCO score is consistent with which ESMO/IQWiG category.
In a comprehensive simulation study with different failure time distributions and treatment effects, we compare all methods using Spearman's correlation and descriptive measures. To determine the ASCO values consistent with the ESMO/IQWiG categories, we used an approach that maximizes weighted Cohen's kappa.
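The cutoff search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the linear weighting scheme, the grid of candidate cutoffs, the function names `weighted_kappa` and `best_cutoffs`, and the rule that a score equal to a cutoff falls into the upper category are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def weighted_kappa(a, b, n_cat):
    """Linearly weighted Cohen's kappa for two ratings coded 0..n_cat-1.

    Assumption: linear disagreement weights; the abstract does not state
    which weighting scheme was used.
    """
    observed = np.zeros((n_cat, n_cat))
    for i, j in zip(a, b):
        observed[i, j] += 1
    observed /= observed.sum()
    # Expected agreement table under independence of the two ratings.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    idx = np.arange(n_cat)
    weights = np.abs(idx[:, None] - idx[None, :]) / (n_cat - 1)
    denom = (weights * expected).sum()
    if denom == 0:
        return 0.0  # kappa is undefined here; treated as 0 in this sketch
    return 1.0 - (weights * observed).sum() / denom


def best_cutoffs(scores, categories, n_cat, grid):
    """Exhaustive search for the cutoffs on a continuous score that
    maximize weighted kappa against an ordinal categorization."""
    best_kappa, best_cuts = -np.inf, None
    for cuts in combinations(sorted(grid), n_cat - 1):
        # A score equal to a cutoff is placed in the upper category (assumption).
        binned = np.digitize(scores, cuts)
        kappa = weighted_kappa(binned, categories, n_cat)
        if kappa > best_kappa:
            best_kappa, best_cuts = kappa, cuts
    return best_kappa, best_cuts


# Toy data, invented purely for illustration (not simulation output).
toy_scores = [5, 10, 15, 25, 30, 35, 45, 50, 55]
toy_categories = [0, 0, 0, 1, 1, 1, 2, 2, 2]
kappa, cuts = best_cutoffs(toy_scores, toy_categories, 3, range(0, 61, 10))
```

On the toy data the search picks the only pair of grid values that separates the three groups perfectly. In a simulation setting, `scores` would hold the continuous ASCO values and `categories` the ESMO or IQWiG category assigned to the same simulated trial.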
Our research shows a strong positive relationship between ASCO and IQWiG and a weak positive relationship between ASCO and ESMO. ASCO scores below 17, from 17 to 20, from 20 to 24, and above 24 correspond, respectively, to the ESMO categories. ASCO values of 21 and 38 serve as cutoffs that represent the IQWiG categories.
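As a worked illustration of these cutoffs, the following Python sketch maps an ASCO score to an ordinal category rank. The cutoff values are the ones reported above; the function name `asco_to_rank`, the 0-based rank (0 = lowest category), and the handling of scores that fall exactly on a cutoff (assigned to the upper category) are assumptions, since the ranges as stated leave the boundaries open.

```python
from bisect import bisect_right

# Cutoffs as reported in the results.
ESMO_CUTOFFS = [17, 20, 24]
IQWIG_CUTOFFS = [21, 38]


def asco_to_rank(asco_score, cutoffs):
    """Return the 0-based ordinal rank of the category matching an ASCO score.

    Assumption: a score equal to a cutoff is assigned to the upper category.
    """
    return bisect_right(cutoffs, asco_score)


# Example: an ASCO score of 22 falls in the third ESMO range (20 to 24)
# and in the second IQWiG range (21 to 38).
esmo_rank = asco_to_rank(22, ESMO_CUTOFFS)
iqwig_rank = asco_to_rank(22, IQWIG_CUTOFFS)
```

The ranks are deliberately left unnamed, because the category labels of the reduced ESMO and IQWiG versions are not given in this text.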
We investigated the statistical aspects of the methods and hence implemented slightly reduced versions of all methods.
IQWiG and ASCO are more conservative than ESMO, which often awards the maximal category regardless of the true effect and is at risk of overcompensating under various failure time distributions. ASCO has characteristics similar to those of IQWiG. Delayed treatment effects and underpowered or overpowered studies influence all methods to some degree. Nevertheless, ESMO is the most liberal method.
For the additional benefit assessment, the American Society of Clinical Oncology (ASCO) uses the hazard ratio point estimate (HR-PE) for its continuous score. In contrast, the European Society for Medical Oncology (ESMO) and the German Institute for Quality and Efficiency in Health Care (IQWiG) compare the lower and upper limits of the 95% HR confidence interval (HR-CI), respectively, with specific thresholds. ESMO generously assigns maximal scores, while IQWiG is more conservative.
This research provides the first comparison between IQWiG and ASCO and describes all three frameworks for additional benefit assessment, aiming for a fair comparison across different stakeholders. Furthermore, thresholds for ASCO consistent with ESMO and IQWiG categories are determined, enabling a fair comparison of the methods in practice.
IQWiG and ASCO are the more conservative methods, while ESMO awards high percentages of maximal categories, especially with various failure time distributions. ASCO has characteristics similar to those of IQWiG. Delayed treatment effects and underpowered or overpowered studies influence all methods. Nevertheless, ESMO is the most liberal method.
ASCO scores below 17, from 17 to 20, from 20 to 24, and above 24 correspond, respectively, to the ESMO categories. ASCO values of 21 and 38 serve as cutoffs that represent the IQWiG categories.