Surgical Outcomes and Quality Improvement Center, Department of Surgery, Northwestern University, Chicago, IL.
Surgical Outcomes and Quality Improvement Center, Department of Surgery, Northwestern University, Chicago, IL; Division of Optimal Research and Patient Care, American College of Surgeons, Chicago, IL.
J Am Coll Surg. 2018 Sep;227(3):303-312.e3. doi: 10.1016/j.jamcollsurg.2018.06.002. Epub 2018 Jun 27.
Surgeon performance profiling is of great interest to surgeons, hospitals, health plans, and the public, yet efforts to date have been contested, with stakeholders at odds over the selection, reliability, and validity of metrics used. We sought to create surgeon-level comparative assessments within the Illinois Surgical Quality Improvement Collaborative.
American College of Surgeons NSQIP data were obtained for 51 Illinois hospitals covering a 30-month period from 2014 to 2016. Surgeon-level, risk-adjusted outcome rates were estimated from 3-level crossed random-effects logistic regression models and classified as low, as expected, or high for each of 7 postoperative outcomes. Model intra-class correlations and provider-specific reliability statistics were calculated.
A total of 123,141 cases were analyzed for 2,724 surgeons. Median provider case volume was 17 (interquartile range 4 to 54). Overall crude complication rates ranged from 0.62% to 7.14% across the 7 outcomes investigated. Surgeon-level variance estimates were low (intra-class correlation coefficients between 0.007 and 0.074). No performance outliers were detected for 3 of the outcomes measures, while a small number of outliers were identified for any morbidity (11 surgeons), surgical site infection (10 surgeons), death or serious morbidity (8 surgeons), and reoperation (1 surgeon). Among all physicians, median reliability was below 0.1 for each outcome.
Few individual surgeon performance outliers could be detected from postoperative patient outcomes in NSQIP clinical registry data for a statewide hospital collaborative over a 30-month period. Low surgeon-specific case volumes and minimal variance between surgeons may limit the utility of American College of Surgeons NSQIP outcome measures for individual profiling. Alternative metrics, such as process measures, patient experience, composite measures, or technical skill assessments, should be explored for surgeon-level measurement.
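The relationship between case volume, between-surgeon variance, and reliability can be illustrated with a minimal sketch. It is not the study's method: it assumes a simplified two-level setting, the common latent-scale approximation for logistic models (residual variance of π²/3), and the Spearman-Brown form of reliability. The function names are hypothetical, and the output will not necessarily match the reliabilities reported from the fitted 3-level models.

```python
import math

# Assumption: latent-scale residual variance of a logistic model.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def latent_icc(between_var):
    """ICC on the latent (logit) scale for a given between-surgeon variance."""
    return between_var / (between_var + LOGISTIC_RESIDUAL_VAR)


def reliability(icc, n_cases):
    """Spearman-Brown reliability of a surgeon's estimate based on n_cases."""
    return n_cases * icc / (1 + (n_cases - 1) * icc)


def cases_needed(icc, target):
    """Smallest whole case count whose reliability reaches the target."""
    return math.ceil(target * (1 - icc) / (icc * (1 - target)))


if __name__ == "__main__":
    # ICC range and case-volume quartiles are taken from the abstract;
    # the 0.7 reliability target is an illustrative assumption.
    for icc in (0.007, 0.074):
        for n in (4, 17, 54):
            print(f"ICC={icc:.3f} n={n:>2} reliability={reliability(icc, n):.3f}")
        print(f"ICC={icc:.3f} cases for 0.7 reliability: {cases_needed(icc, 0.7)}")
```

Under these assumptions, reliability rises with case volume and with the ICC, so a small ICC combined with a low per-surgeon volume leaves little room to distinguish one surgeon from another.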