South Western Sydney Clinical School, University of New South Wales, Sydney, NSW, Australia.
South Western Sydney Clinical School, University of New South Wales, Sydney, NSW, Australia; Department of Cardiothoracic Surgery, Liverpool Hospital, Sydney, NSW, Australia.
Heart Lung Circ. 2022 Apr;31(4):590-601. doi: 10.1016/j.hlc.2021.08.026. Epub 2021 Oct 28.
Risk scoring models (RSMs) are commonly used to estimate postoperative mortality risk in patients undergoing cardiac surgery, but their prediction accuracy may vary across populations and clinical situations. The prognostic accuracy of some RSMs has not yet been fully evaluated in the Australian population. In this retrospective observational study, our aims were to assess the performance of four contemporary RSMs, to identify the best RSMs for prediction of postoperative mortality in a single-centre cohort, and to determine a statistical threshold for classifying patients as having increased, or "higher", mortality risk.
The study population included patients who underwent cardiac surgery at Liverpool Hospital between January 2013 and December 2014. Demographic information was collected, and mortality risks were estimated with the ES2 (EuroSCORE II), STS (Society of Thoracic Surgeons score), AS (AusSCORE total) and ASMR (AusSCORE multi-risk) RSMs. AES (additive EuroSCORE) and LES (logistic EuroSCORE) were included for historical interest. Discrimination (the ability to separate patients who died from those who survived) and calibration (the agreement between the risk estimated by the score and the outcome observed in the population) were evaluated for each RSM to determine its predictive accuracy in the study population. Discrimination was assessed by the AUC (area under the receiver operating characteristic curve), and calibration was considered acceptable when the p-value of the Hosmer-Lemeshow (H-L) test was greater than 0.05. The best AUCs among contemporary models were compared using the DeLong test. For ES2 and STS risk scores, cut-off points, or thresholds, identifying patients at increased risk of mortality were derived using Youden's J statistic, calculated from the sensitivity and specificity of each model in predicting mortality.
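The discrimination and cut-off steps described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code, which the abstract does not describe; the function names `auc` and `youden_cutoff` and the ten-patient toy data are hypothetical and invented purely for illustration. The AUC is computed as the probability that a patient who died received a higher score than a patient who survived, and the cut-off is the score that maximises J = sensitivity + specificity - 1.

```python
from typing import List, Tuple


def auc(scores: List[float], died: List[int]) -> float:
    """AUC as the share of (death, survivor) pairs in which the death
    has the higher risk score; tied pairs count as one half."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: List[float], died: List[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.
    A patient is classed as higher risk when score >= threshold."""
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one death and one survivor")
    best_t, best_j = scores[0], -1.0
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, d in zip(scores, died) if d == 1 and s >= t)
        true_neg = sum(1 for s, d in zip(scores, died) if d == 0 and s < t)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # keep the lowest threshold when several tie
            best_t, best_j = t, j
    return best_t, best_j


# Invented toy data (NOT from the study): risk scores in percent, 1 = died.
toy_scores = [0.5, 0.8, 1.2, 1.9, 2.4, 3.1, 3.6, 5.0, 7.5, 12.0]
toy_died = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]

print("AUC:", round(auc(toy_scores, toy_died), 3))
print("Youden cut-off and J:", youden_cutoff(toy_scores, toy_died))
```

In practice a statistics package would normally be used for these calculations, and comparing two AUCs with the DeLong test requires a variance estimate that this sketch does not attempt.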
Of the 898 patients in the total study population, 738 had scores for all six RSMs. The analyses of the three EuroSCORE risk models and of Youden's J statistic included the total population. Of the models in contemporary use, ES2 had higher discrimination (AUC=0.850) in this population than ASMR (AUC=0.767, p=0.024) and AS (AUC=0.739), and non-significantly higher discrimination than STS (AUC=0.806, p=0.19). All contemporary models had acceptable calibration, but the older LES (H-L p=0.024) did not. Estimated mortality was closest to observed mortality with the ES2 model. Both AES and LES overpredicted mortality. The RSM with the highest discrimination in isolated coronary artery bypass graft surgery (CAGs) (AUC=0.847), isolated valves (AUC=0.830), and females (AUC=0.784) was the ES2 model. STS discrimination was highest in CAGs plus valve procedures (AUC=0.891) and in males (AUC=0.891). Cut-off points for risk scores to define increased-risk populations were 3.0% for ES2 and 1.7% for STS. Depending on the type of procedure, the threshold score of each RSM classified a similar proportion of patients as higher risk (ES2 26% to STS 32%).
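The calibration results above rest on the Hosmer-Lemeshow test. The sketch below shows one common form of that test under stated assumptions: patients are sorted by predicted risk and split into 10 roughly equal groups, and the statistic is referred to a chi-square distribution with groups - 2 degrees of freedom. The abstract does not state the grouping the authors used, so the default of 10 is an assumption, and the helper names `chi2_sf_even_df` and `hosmer_lemeshow` are hypothetical. Predicted risks must be probabilities strictly between 0 and 1, so a score reported as 3.0% would be entered as 0.03.

```python
import math
from typing import List, Tuple


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability. This closed form is valid
    only for positive even degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # term is now half**i / i!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(pred: List[float], died: List[int],
                    groups: int = 10) -> Tuple[float, float]:
    """Return (H-L statistic, p-value) with df = groups - 2.
    `pred` holds predicted death probabilities in (0, 1)."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        size = len(idx)
        expected_deaths = sum(pred[i] for i in idx)
        observed_deaths = sum(died[i] for i in idx)
        # Squared gaps for deaths and for survivors, each scaled by
        # its expected count.
        stat += (observed_deaths - expected_deaths) ** 2 / expected_deaths
        stat += (observed_deaths - expected_deaths) ** 2 / (size - expected_deaths)
    return stat, chi2_sf_even_df(stat, groups - 2)


# Invented toy data (NOT from the study): four risk bands of 10 patients,
# with observed deaths matching the predicted risk in every band.
toy_pred = [0.1] * 10 + [0.2] * 10 + [0.3] * 10 + [0.4] * 10
toy_died = ([1] + [0] * 9) + ([1] * 2 + [0] * 8) \
    + ([1] * 3 + [0] * 7) + ([1] * 4 + [0] * 6)

stat, p_value = hosmer_lemeshow(toy_pred, toy_died, groups=4)
print("H-L statistic:", stat, "p-value:", p_value)
```

Under the criterion used in the abstract, a p-value above 0.05 would be read as acceptable calibration. The closed-form tail probability keeps the sketch dependency-free but restricts it to an even number of degrees of freedom; a statistics library would lift that restriction.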
Among RSMs in contemporary use, ES2 and STS showed the best discrimination and acceptable calibration. Caution is recommended in specific subgroups. Score cut-off points marking increased mortality risk could be identified for these two RSMs in this single-centre cohort.