Ellis S G, Omoigui N, Bittl J A, Lincoff M, Wolfe M W, Howell G, Topol E J
Department of Cardiology, Cleveland Clinic Foundation, OH 44195, USA.
Circulation. 1996 Feb 1;93(3):431-9. doi: 10.1161/01.cir.93.3.431.
Medical consumers are increasingly requesting methods to discriminate among the results of different providers. Standards for appropriate modeling, risk adjustment, and evaluation ("scorecarding") in this setting are not well developed, although such evaluation is being performed by the medical insurance industry and by several states in the United States. Our objectives were to develop and examine clinically meaningful methodology for assessing the operator-specific results for percutaneous coronary revascularization.
From a multicenter database of patients treated since January 1, 1990, we used training and validation samples (n = 4860) to develop several models for risk adjustment and applied them to 38 providers performing 25 to 523 procedures in the database. Models were developed using multivariable logistic regression techniques for combinations of the end points of death, myocardial infarction, bypass surgery, and procedural success. Models were evaluated for predictive accuracy by using receiver operating characteristic (ROC) analysis, for the capacity to discriminate between superior and inferior provider outcomes, and for subjectivity and concordance. Major complications occurred in 3.6% of patients. In the validation sample, the area under the ROC curve (1.0 indicates perfect discriminatory accuracy; 0.5 indicates no apparent accuracy) and the frequency with which operators were identified as having outcomes outside the 95% CI for the outcome in question were, respectively: for death, 0.85 and 7.9%; for death, Q-wave infarction, and bypass surgery, 0.77 and 13.2%; for death, all infarction, and bypass surgery, 0.66 and 10.5%; and for procedural success, 0.76 and 23.7%. For the models as a group, identification of outliers was inversely related to provider volume (P = .05). Models evaluating non-Q-wave infarction or requiring measurement of percent diameter stenosis were identified as being most susceptible to provider manipulation.
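The two evaluation steps described above (ROC analysis of a risk model, then flagging operators whose outcomes fall outside a 95% interval) can be illustrated with a minimal Python sketch. The abstract does not state how the 95% CI was computed, so the normal approximation used here is an assumption, and the function names `roc_auc` and `flag_outlier` are hypothetical, chosen only for illustration.

```python
from math import sqrt


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen event case receives a higher predicted risk than a randomly chosen
    non-event case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def flag_outlier(observed_events, predicted_risks, z=1.96):
    """Compare one provider's observed event count with the count expected
    from the risk model.

    Assumption (not from the abstract): outcomes are independent Bernoulli
    trials and the interval uses a normal approximation.
    """
    expected = sum(predicted_risks)
    variance = sum(p * (1.0 - p) for p in predicted_risks)
    half_width = z * sqrt(variance)
    if observed_events > expected + half_width:
        return "worse than expected"
    if observed_events < expected - half_width:
        return "better than expected"
    return "within expected range"


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # A hypothetical provider with 100 patients, each at 3.6% predicted risk.
    print(flag_outlier(10, [0.036] * 100))
    print(flag_outlier(0, [0.036] * 100))
```

Under these assumptions, a provider with 100 patients and zero events is still classified as within the expected range, because the lower bound of the interval falls below zero. This is one way to see why rare events make it hard to identify superior operators. With 38 providers, the reported identification frequencies of 7.9%, 13.2%, 10.5%, and 23.7% correspond to 3, 5, 4, and 9 operators.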
For percutaneous coronary revascularization, modeling to discriminate between provider outcomes is limited by the low incidence of major adverse events, subjectivity or susceptibility to manipulation of more frequently occurring adverse events, the generally modest predictive capacity of the models, and the low volume of individual provider treatments. Modeling will be most useful in the identification of providers with extremely poor outcomes and for discrimination between providers with very large procedural volume. Until improved understanding of the biological and mechanical correlates of major complications allows the development of more predictive models, interpretation of the results of scorecarding, particularly for low-volume providers, should be made with caution.
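The effect of provider volume on precision can be made concrete with a short calculation. The sketch below assumes a simple normal-approximation interval around the overall 3.6% major complication rate and evaluates it at the smallest and largest provider volumes reported (25 and 523 procedures). The interval method and the function name `ci_half_width` are assumptions for illustration, not the authors' method.

```python
from math import sqrt


def ci_half_width(rate, n, z=1.96):
    """Half-width of a normal-approximation 95% interval for an event rate
    observed over n procedures. The approximation is crude for small n and
    rare events; it is used here only to show how precision scales with n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * sqrt(rate * (1.0 - rate) / n)


if __name__ == "__main__":
    rate = 0.036  # overall major complication rate from the abstract
    for volume in (25, 523):  # smallest and largest provider volumes reported
        hw = ci_half_width(rate, volume)
        print(f"n={volume}: 3.6% +/- {100 * hw:.1f} percentage points")
```

Because the half-width shrinks with the square root of volume, the interval for a 25-procedure provider is roughly 4.6 times wider than for a 523-procedure provider, and at that low volume it is wider than the event rate itself.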