Antolini Laura, Boracchi Patrizia, Biganzoli Elia
Unità di Statistica Medica e Biometria, Istituto Nazionale per lo Studio e la Cura dei Tumori di Milano, Via Venezian 1, 20133 Milano, Italy.
Stat Med. 2005 Dec 30;24(24):3927-44. doi: 10.1002/sim.2427.
To derive models suitable for outcome prediction, a crucial aspect is the availability of appropriate measures of predictive accuracy, which have to be usable for a general class of models. Harrell's C discrimination index is an extension of the area under the ROC curve to the case of censored survival data, and it has a straightforward interpretation. For a model including covariates with time-dependent effects and/or time-dependent covariates, the original definition of C would require the prediction of individual failure times, which is not generally addressed in most clinical applications. Here we propose a time-dependent discrimination index Ctd where the whole predicted survival function is utilized as outcome prediction, and the ability to discriminate among subjects having different outcomes is summarized over time. Ctd is based on a novel definition of concordance: a subject who developed the event should have a lower predicted probability of surviving beyond his/her survival time than any subject who survived longer. The predicted survival function of a subject who developed the event is compared to: (1) that of subjects who developed the event before his/her survival time, and (2) that of subjects who developed the event, or were censored, after his/her survival time. Subjects who were censored are involved in comparisons with subjects who developed the event before their observed times. The index reduces to the previous C in the presence of separation between survival curves on the whole follow-up. A confidence interval for Ctd is derived using the jackknife method on correlated one-sample U-statistics. The proposed index is used to evaluate the discrimination ability of a model, including covariates having time-dependent effects, concerning time to relapse in breast cancer patients treated with adjuvant tamoxifen. The model was obtained from 596 patients entered prospectively at Istituto Nazionale per lo Studio e la Cura dei Tumori di Milano (INT). The model discrimination ability was validated on an independent testing data set of 175 patients provided by Centro Regionale Indicatori Biochimici di Tumore (CRIBT) in Venice.
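The concordance definition in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several simplifying assumptions: a pair (i, j) is treated as comparable only when subject i developed the event and subject j has a strictly longer observed time, ties in predicted survival are counted as non-concordant, and the jackknife confidence interval is omitted. The function name `ctd_index` and the input layout `surv_at_event[i, j]` (the predicted probability that subject j survives beyond the observed time of subject i) are hypothetical choices made for this illustration.

```python
import numpy as np


def ctd_index(times, events, surv_at_event):
    """Sketch of a time-dependent concordance index.

    times[i]            observed time of subject i
    events[i]           True if subject i developed the event, False if censored
    surv_at_event[i, j] predicted S(times[i] | covariates of subject j)

    Assumptions of this sketch: only pairs with times[i] < times[j] and an
    event for subject i are comparable; tied predictions are not concordant.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    surv = np.asarray(surv_at_event, dtype=float)
    n = len(times)

    concordant = 0
    comparable = 0
    for i in range(n):
        if not events[i]:
            continue  # censored subjects never anchor a comparison
        for j in range(n):
            if times[i] < times[j]:  # subject j survived longer than subject i
                comparable += 1
                # Concordant: at time T_i, subject i has the lower
                # predicted probability of surviving beyond T_i.
                if surv[i, i] < surv[i, j]:
                    concordant += 1

    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy usage with made-up exponential survival curves S(t | x_j) = exp(-h_j * t).
times = np.array([1.0, 2.0, 3.0, 4.0])
events = np.array([True, True, False, True])
hazards = np.array([4.0, 3.0, 2.0, 1.0])  # higher hazard fails earlier
surv_at_event = np.exp(-np.outer(times, hazards))
print(ctd_index(times, events, surv_at_event))  # 1.0: every comparable pair is concordant
```

Because the comparison uses the predicted survival functions evaluated at the earlier subject's event time, rather than a single predicted failure time, the same computation applies when the predicted curves of two subjects cross, as can happen with time-dependent effects.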