Sertdemir Y, Burgut R
Department of Biostatistics, Cukurova University School of Medicine, Balcali-Adana, Turkey.
Contemp Clin Trials. 2009 Jan;30(1):8-12. doi: 10.1016/j.cct.2008.08.006. Epub 2008 Sep 9.
In recent years, the use of surrogate endpoints (S) has attracted growing interest. In clinical trials, it is important to obtain treatment outcomes as early as possible, which creates a need for surrogate endpoints (S) that can be measured earlier than the true endpoint (T). Before a surrogate endpoint can be used, however, it must be validated. For a candidate surrogate endpoint such as time to recurrence, the validation result may change dramatically between clinical trials. The aim of this study is to show, with an application to real data, how the validation criterion R²_trial proposed by Buyse et al. is influenced by the magnitude of the treatment effect.
The criterion R²_trial proposed by Buyse et al. (2000) was applied to four data sets from colon cancer clinical trials (C-01, C-02, C-03 and C-04). Each clinical trial was analyzed separately for the treatment effect on survival (true endpoint) and on recurrence-free survival (surrogate endpoint), and the same analysis was repeated for each center within each trial. The results were used for a standard validation analysis. The centers were then divided into three equal-sized groups according to their Wald statistic.
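As an illustration of the trial-level criterion, the following minimal sketch computes an R² by regressing center-level treatment effects on the true endpoint on the corresponding effects on the surrogate. It is a simplified, unweighted two-stage version and not necessarily the exact model used in the study; the function name `r2_trial` and the effect estimates are hypothetical.

```python
import numpy as np

def r2_trial(alpha, beta):
    """Unweighted R^2 from regressing beta (effect on T) on alpha (effect on S).

    Simplifying assumption: estimation error in the center-level
    effects is ignored and all centers get equal weight.
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Design matrix with an intercept and the surrogate effect.
    X = np.column_stack([np.ones_like(alpha), alpha])
    coef, *_ = np.linalg.lstsq(X, beta, rcond=None)
    resid = beta - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((beta - beta.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Hypothetical center-level log hazard ratios (not data from the trials).
alpha_hat = [-0.40, -0.10, 0.05, -0.30, -0.22, 0.12]  # effect on S
beta_hat = [-0.35, -0.02, 0.10, -0.20, -0.25, 0.05]   # effect on T
print(round(r2_trial(alpha_hat, beta_hat), 3))
```

With a single predictor and an intercept, this R² equals the squared Pearson correlation between the two sets of effect estimates.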
The R²_trial values were 0.641 (95% CI 0.432-0.782), 0.223 (95% CI 0.008-0.503), 0.761 (95% CI 0.550-0.872) and 0.560 (95% CI 0.404-0.687) for C-01, C-02, C-03 and C-04, respectively. R²_trial varied with the Wald statistics observed for the centers used in the validation process: the higher the Wald statistic group, the higher the observed R²_trial value.
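The grouping of centers by Wald statistic can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which effect's Wald statistic was used, so the example assumes the squared ratio of the estimated effect on the true endpoint to its standard error; the function name `r2_by_wald_group` and all numbers are hypothetical.

```python
import numpy as np

def r2_by_wald_group(alpha, beta, wald, n_groups=3):
    """Split centers into equal-sized groups by Wald statistic (ascending)
    and return the squared correlation of alpha and beta in each group."""
    alpha, beta, wald = (np.asarray(v, dtype=float) for v in (alpha, beta, wald))
    order = np.argsort(wald)  # centers sorted from lowest to highest Wald
    results = []
    for idx in np.array_split(order, n_groups):
        r = np.corrcoef(alpha[idx], beta[idx])[0, 1]
        results.append(float(r ** 2))
    return results

# Hypothetical center-level estimates (not data from the trials).
alpha_hat = np.array([0.02, -0.05, 0.08, -0.15, -0.22, -0.10, -0.45, -0.60, -0.38])
beta_hat = np.array([0.05, 0.01, -0.03, -0.12, -0.18, -0.16, -0.40, -0.55, -0.36])
se_beta = np.full(9, 0.15)
# Assumed definition of the Wald statistic: (estimate / standard error)^2.
wald = (beta_hat / se_beta) ** 2
print(r2_by_wald_group(alpha_hat, beta_hat, wald))
```

Each returned value is the within-group R², ordered from the lowest to the highest Wald group.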
Recurrence-free survival is not a good surrogate for overall survival in clinical trials with non-significant treatment effects, and it is a moderate surrogate in trials with significant treatment effects. This shows that the level of significance of the treatment effect should be taken into account when validating surrogate endpoints.