Hayward Rodney A, Kent David M, Vijan Sandeep, Hofer Timothy P
Department of Veterans Affairs, VA Center for Practice Management & Outcomes Research, VA Ann Arbor Healthcare System, Ann Arbor, MI, USA.
BMC Med Res Methodol. 2006 Apr 13;6:18. doi: 10.1186/1471-2288-6-18.
When subgroup analyses of a positive clinical trial are unrevealing, such findings are commonly used to argue that the treatment's benefits apply to the entire study population; however, such analyses are often limited by poor statistical power. Multivariable risk-stratified analysis has been proposed as an important advance in investigating heterogeneity in treatment benefits, yet no one has conducted a systematic statistical examination of circumstances influencing the relative merits of this approach vs. conventional subgroup analysis.
Using simulated clinical trials in which each patient's probability of an outcome was determined stochastically by the presence of risk factors and the effects of treatment, we compared conventional and "risk-stratified" subgroup analysis under a variety of circumstances in which there is a small amount of uniformly distributed treatment-related harm. The statistical power to detect treatment-effect heterogeneity was calculated for both approaches while varying: 1) the number, prevalence and odds ratios of individual risk factors for the outcome in the absence of treatment, 2) the predictiveness of the multivariable risk model (including the accuracy of its weights), 3) the degree of treatment-related harm, and 4) the average untreated risk of the study population.
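The data-generating mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names, the logistic form of the risk model, the 1:1 randomization, and the example values (a 25% relative risk reduction and a 1% absolute harm) are all hypothetical choices made for the sketch.

```python
import math
import random

def untreated_risk(factors, intercept, log_odds_ratios):
    """Untreated outcome probability from binary risk factors (assumed logistic model)."""
    logit = intercept + sum(b * x for b, x in zip(log_odds_ratios, factors))
    return 1.0 / (1.0 + math.exp(-logit))

def treated_risk(risk, rrr, harm):
    """Constant relative benefit plus a uniform absolute harm, capped at 1."""
    return min(1.0, risk * (1.0 - rrr) + harm)

def net_rrr(risk, rrr, harm):
    """Net relative risk reduction implied at a given untreated risk."""
    return 1.0 - treated_risk(risk, rrr, harm) / risk

def simulate_trial(n, prevalence, odds_ratios, intercept, rrr, harm, seed=0):
    """Return one (factors, treated, untreated_risk, event) tuple per patient."""
    rng = random.Random(seed)
    betas = [math.log(o) for o in odds_ratios]
    patients = []
    for _ in range(n):
        # Each risk factor is present independently with the same prevalence.
        factors = [1 if rng.random() < prevalence else 0 for _ in betas]
        treated = rng.random() < 0.5  # assumed 1:1 randomization
        p0 = untreated_risk(factors, intercept, betas)
        p = treated_risk(p0, rrr, harm) if treated else p0
        event = rng.random() < p  # outcome drawn stochastically
        patients.append((factors, treated, p0, event))
    return patients

# Hypothetical values: the same treatment looks harmful at low untreated risk
# and beneficial at high untreated risk once a uniform harm is added.
for r in (0.02, 0.05, 0.20):
    print(f"untreated risk {r:.2f}: net RRR {net_rrr(r, rrr=0.25, harm=0.01):.2f}")
```

With these illustrative inputs the net relative risk reduction moves from negative at 2% untreated risk to about 20% at 20% untreated risk, which is the kind of heterogeneity the subgroup analyses are meant to detect.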
Conventional subgroup analysis (in which single patient attributes are evaluated "one-at-a-time") had at best moderate statistical power (30% to 45%) to detect variation in a treatment's net relative risk reduction resulting from treatment-related harm, even under optimal circumstances (overall statistical power of the study was good and treatment-effect heterogeneity was evaluated across a major risk factor [OR = 3]). In some instances a multivariable risk-stratified approach also had low to moderate statistical power (especially when the multivariable risk prediction tool had low discrimination). However, a multivariable risk-stratified approach can have excellent statistical power to detect heterogeneity in net treatment benefit under a wide variety of circumstances, including circumstances in which conventional subgroup analysis has poor statistical power.
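To make "detecting variation in net relative risk reduction" concrete, the sketch below compares the relative risk in two strata with a simple z-test on the difference in log relative risks. The abstract does not state which heterogeneity test the authors used, so this test is a swapped-in illustration, and the function names and event counts are hypothetical.

```python
import math

def log_rr(events_t, n_t, events_c, n_c):
    """Log relative risk (treated vs. control) and its large-sample variance."""
    rr = (events_t / n_t) / (events_c / n_c)
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return math.log(rr), var

def heterogeneity_z(stratum_a, stratum_b):
    """z statistic for a difference in log relative risk between two strata."""
    la, va = log_rr(*stratum_a)
    lb, vb = log_rr(*stratum_b)
    return (la - lb) / math.sqrt(va + vb)

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts: (treated events, treated n, control events, control n).
low_risk = (25, 1000, 20, 1000)     # relative risk 1.25 (net harm)
high_risk = (160, 1000, 200, 1000)  # relative risk 0.80 (net benefit)

z = heterogeneity_z(low_risk, high_risk)
print(f"z = {z:.2f}, two-sided p = {two_sided_p(z):.3f}")
```

Applying the same function to strata defined by a single risk factor and to strata defined by a multivariable risk score, over many simulated trials, is one way the two approaches could be compared on statistical power.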
These results suggest that under many likely scenarios, a multivariable risk-stratified approach will have substantially greater statistical power than conventional subgroup analysis for detecting heterogeneity in treatment benefits and safety related to previously unidentified treatment-related harm. Subgroup analyses must always be well justified and interpreted with care, and conventional subgroup analyses can be useful under some circumstances; however, clinical trial reporting should include a multivariable risk-stratified analysis when an adequate externally developed risk prediction tool is available.