Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK.
Weill Cornell Medical College, Cornell University, New York, USA.
Interact Cardiovasc Thorac Surg. 2021 Oct 29;33(5):673-686. doi: 10.1093/icvts/ivab151.
The most widely used mortality risk prediction models in cardiac surgery are the European System for Cardiac Operative Risk Evaluation (ES) and the Society of Thoracic Surgeons (STS) score. There is no agreement on which score is more accurate or which should be used in each population subgroup. We sought to provide a thorough quantitative assessment of these 2 models.
We performed a systematic literature review and captured information on discrimination, as quantified by the area under the receiver operating characteristic curve (AUC), and calibration, as quantified by the ratio of observed-to-expected mortality (O:E). We performed random effects meta-analyses of the performance of the individual models, as well as pairwise comparisons and subgroup analyses by procedure type, time and continent.
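As an illustration of the pooling step described above, the following is a minimal sketch of a random effects meta-analysis of O:E ratios on the log scale. It uses the DerSimonian-Laird estimator of between-study variance, which is one common choice and is an assumption here: the abstract does not state which estimator was used. The standard error of each log O:E ratio is approximated as 1/sqrt(observed deaths), which treats expected deaths as fixed and is likewise an assumption. The function names and the study counts in the usage note are hypothetical.

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random effects pooling (assumed estimator, not taken from the abstract).

    Returns the pooled estimate, its standard error and the between-study variance tau^2.
    """
    # Inverse-variance (fixed effect) weights and pooled estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau^2, truncated at zero.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


def pool_oe_ratios(observed, expected, z=1.96):
    """Pool observed-to-expected mortality ratios on the log scale.

    Assumes every study has at least one observed death and approximates
    SE(log O:E) as 1/sqrt(observed), treating expected deaths as fixed.
    Returns (pooled ratio, CI lower, CI upper, tau^2).
    """
    logs = [math.log(o / e) for o, e in zip(observed, expected)]
    ses = [1.0 / math.sqrt(o) for o in observed]
    pooled, se, tau2 = pool_random_effects(logs, ses)
    return (
        math.exp(pooled),
        math.exp(pooled - z * se),
        math.exp(pooled + z * se),
        tau2,
    )
```

For example, `pool_oe_ratios([30, 12, 45], [25.0, 15.0, 40.0])` pools three hypothetical studies and returns the pooled O:E ratio, its 95% CI bounds and the estimated between-study variance. A pooled ratio above 1 would indicate that the model underpredicts mortality on average.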
The EuroSCORE II (ES2) {AUC 0.783 [95% confidence interval (CI) 0.765-0.800]; O:E 1.102 (95% CI 0.943-1.289)} and STS [AUC 0.757 (95% CI 0.727-0.785); O:E 1.111 (95% CI 0.853-1.447)] showed good overall discrimination and calibration. There was no significant difference in the discrimination of the 2 models (difference in AUC -0.016; 95% CI -0.034 to 0.002; P = 0.09). However, the calibration of ES2 showed significant geographical variations (P < 0.001) and a trend towards miscalibration with time (P = 0.057). This was not seen with STS.
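The reported O:E intervals can be read on the log scale, where a ratio's confidence interval is symmetric about its point estimate. The sketch below, with hypothetical helper names, recovers the geometric midpoint and the log-scale standard error from a reported 95% CI. It assumes a normal approximation with z = 1.96, which the abstract does not state explicitly.

```python
import math


def geometric_midpoint(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.exp((math.log(lower) + math.log(upper)) / 2.0)


def log_se_from_ci(lower, upper, z=1.96):
    """Standard error of the log ratio implied by the CI (normal approximation assumed)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# O:E confidence limits as reported in the results above.
reported = {
    "ES2": (0.943, 1.289),
    "STS": (0.853, 1.447),
}

for model, (lower, upper) in reported.items():
    mid = geometric_midpoint(lower, upper)
    se = log_se_from_ci(lower, upper)
    includes_one = lower < 1.0 < upper
    print(f"{model}: midpoint {mid:.3f}, log SE {se:.3f}, CI includes 1: {includes_one}")
```

The geometric midpoints agree with the reported point estimates (1.102 and 1.111) to within rounding, and both intervals include 1, which is consistent with the statement that overall calibration was good. The wider STS interval corresponds to a larger log-scale standard error.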
ES2 and STS are reliable predictors of short-term mortality following adult cardiac surgery in the populations from which they were derived. STS may be more broadly applicable than ES2 when comparing outcomes across continents.
PROSPERO (https://www.crd.york.ac.uk/PROSPERO/) CRD42020220983.