*Department of Surgery, University of Michigan Health System, Ann Arbor, MI; †Department of Economics, Dartmouth College, Hanover, NH.
Med Care. 2014 Jun;52(6):565-71. doi: 10.1097/MLR.0000000000000138.
Because of small sample sizes and low event rates, risk-adjusted surgical outcomes often do not meet reliability benchmarks for distinguishing hospital performance. Nonetheless, it remains unclear whether these measures are still useful for predicting future hospital surgical performance.
We used national Medicare data to analyze patients undergoing colectomy from 2007 to 2010 (n=462,959 patients). We first quantified 2007-2008 outcome reliability (ability to differentiate quality differences) and ranked hospitals based on their 2007-2008 risk-adjusted outcome rates. To assess the ability of adjusted outcomes to predict true performance, we evaluated future (2009-2010) outcomes across quintiles of past performance. We then systematically sampled 2007-2008 cases to evaluate performance prediction when hospitals' past performance was measured with progressively lower reliability levels.
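The design described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the common signal-to-noise definition of reliability (between-hospital variance divided by between-hospital variance plus within-hospital variance over caseload), which the abstract does not spell out. It also uses simple random subsampling as a stand-in for the study's systematic sampling. All function names and the numbers in the usage note are hypothetical.

```python
import random


def reliability(between_var, within_var, n_cases):
    """Signal-to-noise reliability (assumed definition, not taken from the abstract):
    between-hospital variance / (between-hospital variance + within-hospital variance / n)."""
    return between_var / (between_var + within_var / n_cases)


def assign_quintiles(rates):
    """Rank hospitals by risk-adjusted outcome rate.
    Returns one label per hospital: 1 = lowest rates (best), 5 = highest rates (worst)."""
    order = sorted(range(len(rates)), key=lambda i: rates[i])
    labels = [0] * len(rates)
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // len(rates) + 1
    return labels


def subsample_cases(cases, fraction, seed=0):
    """Keep a fraction of a hospital's baseline cases.
    Random sampling is an assumption; the study sampled systematically."""
    rng = random.Random(seed)
    k = max(1, round(len(cases) * fraction))
    return rng.sample(cases, k)
```

Under this definition, cutting a hospital's caseload from 100 to 10 cases with hypothetical variances of 0.01 (between) and 0.09 (within) lowers reliability, which is the effect progressive sampling is meant to induce. Hospitals would then be re-ranked with `assign_quintiles` on the subsampled rates and compared on their later outcomes.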
Outcomes in 2007-2008 were good predictors of outcomes in the next 2 years (2009-2010), but predictive strength depended upon reliability. With progressive sampling of 2007-2008 caseloads, outcome reliability and predictive strength decreased. With 100% sampling of 2007-2008 caseloads, the worst versus best hospital quintile based on past performance had 1.52 [95% confidence interval (CI), 1.44-1.60] times the odds of mortality and 1.50 (95% CI, 1.44-1.56) times the odds of complications in 2009-2010. With 10% sampling, outcome reliability was well below commonly accepted benchmarks, but the worst quintile of hospitals in 2007-2008 still had 1.12 (95% CI, 1.06-1.19) times the odds of mortality and 1.16 (95% CI, 1.11-1.21) times the odds of complications in 2009-2010 compared with the best quintile of hospitals.
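For readers unfamiliar with the "times the odds" comparisons above, the following sketch computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. This is a simplification: the reported estimates come from risk-adjusted analyses, presumably regression-based, whereas this example ignores patient risk entirely. The function name and the counts in the usage note are hypothetical, not study data.

```python
import math


def odds_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted odds ratio of group A versus group B with a Wald confidence interval.
    Group A could be the worst quintile, group B the best quintile."""
    a, b = events_a, n_a - events_a  # events / non-events in group A
    c, d = events_b, n_b - events_b  # events / non-events in group B
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only
or_, lo, hi = odds_ratio_ci(events_a=600, n_a=10_000, events_b=400, n_b=10_000)
print(f"OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1, as all four intervals reported above do, indicates that the difference between the worst and best quintiles is unlikely to be due to chance alone.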
Even at very low reliability levels, risk-adjusted outcome measures may distinguish best and worst hospitals' surgical performance. This study suggests that commonly accepted reliability thresholds may be too high, especially in the context of selective referral.