Jones Douglas W, Simons Jessica P, Osborne Nicholas H, Schermerhorn Marc, Dimick Justin B, Schanzer Andres
Division of Vascular and Endovascular Surgery, University of Massachusetts Medical Center, University of Massachusetts Chan Medical School, Worcester, MA.
J Vasc Surg. 2024 Sep;80(3):715-723.e1. doi: 10.1016/j.jvs.2024.04.056. Epub 2024 Apr 30.
Cumulative, probability-based metrics are regularly used to measure quality in professional sports, but these methods have not been applied to health care delivery. These techniques have the potential to be particularly useful in describing surgical quality, where case volume is variable and outcomes tend to be dominated by statistical "noise." The established statistical technique used to adjust for differences in case volume is reliability-adjustment, which emphasizes statistical "signal" but has several limitations. We sought to validate a novel measure of surgical quality based on earned outcomes methods (deaths above average [DAA]) against reliability-adjusted mortality rates, using abdominal aortic aneurysm (AAA) repair outcomes to illustrate the measure's performance.
Earned outcomes methods were used to calculate the outcome of interest for each patient: DAA. Hospital-level DAA was calculated for non-ruptured open AAA repair and endovascular aortic repair (EVAR) in the Vascular Quality Initiative database from 2016 to 2019. DAA for each center is the sum, across that center's patients, of the observed outcome minus the predicted risk of death; predicted risk of death was calculated using established multivariable logistic regression modeling. Correlations of DAA with reliability-adjusted mortality rates and with procedure volume were determined. Because an accurate quality metric should correlate with future results, outcomes from 2016 to 2017 were used to categorize hospital quality based on: (1) risk-adjusted mortality; (2) risk- and reliability-adjusted mortality; and (3) DAA. The best performing quality metric was determined by comparing the ability of these categories to predict 2018 to 2019 risk-adjusted outcomes.
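The hospital-level DAA calculation described above can be illustrated with a minimal sketch. It assumes the observed outcome is coded as 1 for death and 0 for survival, and that each patient's predicted risk has already been produced by a separate risk model (the study used multivariable logistic regression, which is not reproduced here). The function name `hospital_daa`, the record layout, and the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def hospital_daa(records):
    """Sum (observed death - predicted risk) per hospital.

    records: iterable of (hospital_id, died, predicted_risk) tuples, where
    died is a bool and predicted_risk is a probability in [0, 1] supplied
    by an external risk model (an assumption of this sketch).
    """
    totals = defaultdict(float)
    for hospital_id, died, predicted_risk in records:
        observed = 1.0 if died else 0.0
        totals[hospital_id] += observed - predicted_risk
    return dict(totals)


# Hypothetical toy data, for illustration only.
records = [
    ("A", False, 0.02),
    ("A", False, 0.05),
    ("A", True, 0.10),
    ("B", False, 0.03),
    ("B", False, 0.04),
]

daa = hospital_daa(records)
for hospital_id, value in sorted(daa.items()):
    print(f"{hospital_id}: {value:+.2f} DAA")
```

Under these assumptions, a hospital with no deaths still accumulates a negative DAA that grows with the number and risk of the patients it treats, so zero-mortality hospitals with different caseloads receive different values.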
During the study period, 3734 patients underwent open repair (106 hospitals), and 20,680 patients underwent EVAR (183 hospitals). DAA was closely correlated with reliability-adjusted mortality rates for open repair (r = 0.94; P < .001) and EVAR (r = 0.99; P < .001). DAA also correlated with hospital case volume for open repair (r = -0.54; P < .001), but not EVAR (r = 0.07; P = .3). In 2016 to 2017, most hospitals had 0% mortality (55% open repair, 57% EVAR), making it impossible to evaluate these hospitals using traditional risk-adjusted mortality rates alone. Further, zero mortality hospitals in 2016 to 2017 did not demonstrate improved outcomes in 2018 to 2019 for open repair (3.8% vs 4.6%; P = .5) or EVAR (0.8% vs 1.0%; P = .2) compared with all other hospitals. In contrast to traditional risk-adjustment, 2016 to 2017 DAA evenly divided centers into quality quartiles that predicted 2018 to 2019 performance, with an increased mortality rate associated with each decrement in quality quartile (Q1, 3.2%; Q2, 4.0%; Q3, 5.1%; Q4, 6.0%). There was a significantly higher risk of mortality at worst quartile open repair hospitals compared with best quartile hospitals (odds ratio, 2.01; 95% confidence interval, 1.07-3.76; P = .03). Using 2016 to 2019 DAA to define quality, highest quality quartile open repair hospitals had lower median DAA compared with lowest quality quartile hospitals (-1.18 DAA vs +1.32 DAA; P < .001), correlating with lower median reliability-adjusted mortality rates (3.6% vs 5.1%; P < .001).
Adjustment for differences in hospital volume is essential when measuring hospital-level outcomes. Earned outcomes accurately categorize hospital quality and correlate with reliability-adjustment but are easier to calculate and interpret. From 2016 to 2019, highest quality open AAA repair hospitals prevented >40 perioperative deaths compared with the average hospital, and >80 perioperative deaths compared with lowest quality hospitals.