Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Boston, MA 02111, USA; Department of Medical Statistics, Leiden University Medical Center, Albinusdreef 2, Leiden 2333 ZA, The Netherlands.
Department of Medical Statistics, Leiden University Medical Center, Albinusdreef 2, Leiden 2333 ZA, The Netherlands; Department of Public Health, Erasmus University Medical Center, 's-Gravendijkwal 230, Rotterdam 3015 CE, The Netherlands.
J Clin Epidemiol. 2018 Feb;94:59-68. doi: 10.1016/j.jclinepi.2017.10.021. Epub 2017 Nov 11.
Clinical prediction models that support treatment decisions are usually evaluated for their ability to predict the risk of an outcome rather than treatment benefit, that is, the difference between outcome risk with vs. without therapy. We aimed to define performance metrics for a model's ability to predict treatment benefit.
We analyzed data from the Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) trial and from three recombinant tissue plasminogen activator trials. We assessed alternative prediction models with a conventional risk concordance-statistic (c-statistic) and a novel c-statistic for benefit. We defined observed treatment benefit by the outcomes in pairs of patients matched on predicted benefit but discordant for treatment assignment. The 'c-for-benefit' represents the probability that, from two randomly chosen matched patient pairs with unequal observed benefit, the pair with greater observed benefit also has a higher predicted benefit.
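The definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made here for illustration: patients are matched 1:1 by rank of predicted benefit within each arm, the larger arm is thinned evenly across ranks when arm sizes differ, a pair's predicted benefit is the mean of its two members' predictions, the outcome is binary with 1 denoting an adverse event, and ties in predicted benefit count as one half. The function name `c_for_benefit` and its arguments are hypothetical.

```python
import numpy as np


def c_for_benefit(pred_benefit, treated, outcome):
    """Concordance between predicted and observed benefit over matched pairs.

    pred_benefit: predicted absolute risk reduction per patient
    treated:      1/True if the patient received the treatment, else 0/False
    outcome:      1 if the adverse event occurred, else 0 (assumed coding)
    """
    pred_benefit = np.asarray(pred_benefit, dtype=float)
    treated = np.asarray(treated).astype(bool)
    outcome = np.asarray(outcome, dtype=int)

    # Order each arm by predicted benefit.
    t_idx = np.flatnonzero(treated)
    c_idx = np.flatnonzero(~treated)
    t_idx = t_idx[np.argsort(pred_benefit[t_idx], kind="stable")]
    c_idx = c_idx[np.argsort(pred_benefit[c_idx], kind="stable")]

    # Assumption: with unequal arms, thin the larger arm evenly across ranks.
    n = min(len(t_idx), len(c_idx))
    t_idx = t_idx[np.round(np.linspace(0, len(t_idx) - 1, n)).astype(int)]
    c_idx = c_idx[np.round(np.linspace(0, len(c_idx) - 1, n)).astype(int)]

    # Rank-based 1:1 matching: k-th treated patient with k-th control patient.
    pair_pred = (pred_benefit[t_idx] + pred_benefit[c_idx]) / 2.0
    # Observed benefit per pair: +1 benefit, 0 no effect, -1 harm.
    pair_obs = outcome[c_idx] - outcome[t_idx]

    # Compare every pair of pairs; entry [i, j] contrasts pair i with pair j.
    diff_obs = pair_obs[:, None] - pair_obs[None, :]
    diff_pred = pair_pred[:, None] - pair_pred[None, :]

    comparable = diff_obs > 0  # pair i has greater observed benefit than pair j
    n_comparable = comparable.sum()
    if n_comparable == 0:
        return float("nan")  # undefined when all observed benefits are equal

    concordant = (comparable & (diff_pred > 0)).sum()
    tied = (comparable & (diff_pred == 0)).sum()
    return float((concordant + 0.5 * tied) / n_comparable)


# Toy data (invented for illustration): four treated, four control patients.
pred = [0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4]
trt = [1, 1, 1, 1, 0, 0, 0, 0]
out = [1, 0, 0, 0, 0, 0, 0, 1]
print(c_for_benefit(pred, trt, out))
```

In this toy data the matched pairs have observed benefits of -1, 0, 0, and 1 in order of increasing predicted benefit, so every comparable pair of pairs is concordant. The pairwise comparison uses an n-by-n array, which is adequate for trial-sized data but would need a more economical approach for very large samples.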
Compared to a model without treatment interactions, the SYNTAX score II had improved ability to discriminate treatment benefit (c-for-benefit 0.590 vs. 0.552), despite having similar risk discrimination (c-statistic 0.725 vs. 0.719). However, for the simplified stroke-thrombolytic predictive instrument (TPI) vs. the original stroke-TPI, the c-for-benefit (0.584 vs. 0.578) was similar.
The proposed methodology has the potential to measure a model's ability to predict treatment benefit not captured with conventional performance metrics.