Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA; Service de Biostatistique et Information Médicale, Unité INSERM 1153, Equipe ECSTRA, Hôpital Saint Louis, Paris, France; Service d'Anesthésie-Réanimation, Hôpital Européen Georges Pompidou, Paris, France.
Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA.
Lancet Respir Med. 2015 Jan;3(1):42-52. doi: 10.1016/S2213-2600(14)70239-5. Epub 2014 Nov 24.
Improved mortality prediction for patients in intensive care units is a big challenge. Many severity scores have been proposed, but findings of validation studies have shown that they are not adequately calibrated. The Super ICU Learner Algorithm (SICULA), an ensemble machine learning technique that uses multiple learning algorithms to obtain better prediction performance, does at least as well as the best member of its library. We aimed to assess whether the Super Learner could provide a new mortality prediction algorithm for patients in intensive care units, and to assess its performance compared with other scoring systems.
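The ensemble idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two candidate learners (`fit_prevalence`, `fit_binned`), the cut-off of 40, the synthetic data, and the grid search over a single convex weight are all assumptions made for illustration. A real Super Learner library would hold many more algorithms and would typically estimate the weights with a constrained regression. The point shown is only that the weight is chosen on out-of-fold predictions, so the ensemble's cross-validated risk cannot exceed that of its best member.

```python
import random

def fit_prevalence(xs, ys):
    """Toy learner 1 (hypothetical): ignore the covariate, predict the training death rate."""
    rate = sum(ys) / len(ys)
    return lambda x: rate

def fit_binned(xs, ys, cut=40):
    """Toy learner 2 (hypothetical): death rate below/above an assumed score cut-off."""
    overall = sum(ys) / len(ys)
    def rate(group):
        return sum(group) / len(group) if group else overall
    low = rate([y for x, y in zip(xs, ys) if x < cut])
    high = rate([y for x, y in zip(xs, ys) if x >= cut])
    return lambda x: low if x < cut else high

def cross_validated_predictions(fit, xs, ys, k=5):
    """Out-of-fold predictions: each patient is scored by a model fitted without them."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, n, k):  # indices with i % k == fold
            preds[i] = model(xs[i])
    return preds

def mse(preds, ys):
    """Mean squared error (Brier score) of predicted risks against 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def blend(a, p1, p2):
    """Convex combination a*p1 + (1-a)*p2 of two prediction vectors."""
    return [a * u + (1 - a) * v for u, v in zip(p1, p2)]

def super_learner_weight(p1, p2, ys, steps=100):
    """Grid search for the weight in [0, 1] that minimises cross-validated risk."""
    return min((i / steps for i in range(steps + 1)),
               key=lambda a: mse(blend(a, p1, p2), ys))

# Synthetic data (assumption): a severity score in [0, 100] and a death
# indicator whose probability rises with the score.
rng = random.Random(0)
xs = [rng.uniform(0, 100) for _ in range(500)]
ys = [1 if rng.random() < x / 150 else 0 for x in xs]

p1 = cross_validated_predictions(fit_prevalence, xs, ys)
p2 = cross_validated_predictions(fit_binned, xs, ys)
weight = super_learner_weight(p1, p2, ys)
risk_ensemble = mse(blend(weight, p1, p2), ys)
risk_best_single = min(mse(p1, ys), mse(p2, ys))
```

Because the grid includes the weights 0 and 1, each single learner is itself a candidate ensemble, which is why `risk_ensemble` is never larger than `risk_best_single` in this sketch.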
We used the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database (version 26), which includes all patients admitted to an intensive care unit at the Beth Israel Deaconess Medical Centre, Boston, MA, USA, from January, 2001, to December, 2008. We assessed the calibration, discrimination, and risk classification of predicted hospital mortality based on Super Learner compared with SAPS-II, APACHE-II, and SOFA. We calculated performance measures with cross-validation to avoid making biased assessments. Our proposed score was then externally validated on a dataset of 200 randomly selected patients admitted to the intensive care unit of Hôpital Européen Georges-Pompidou, Paris, France, between Sept 1, 2013, and June 30, 2014. The primary outcome was hospital mortality. The explanatory variables were the same as those included in the SAPS-II score.
24,508 patients were included, with median SAPS-II of 38 (IQR 27-51) and median SOFA of 5 (IQR 2-8). 3002 of 24,508 (12%) patients died in the Beth Israel Deaconess Medical Centre. We produced two sets of predictions based on the Super Learner; the first based on the 17 variables as they appear in the SAPS-II score (SL1), and the second, on the original, untransformed variables (SL2). The two versions yielded average predicted probabilities of death of 0·12 (IQR 0·02-0·16) and 0·13 (0·01-0·19), whereas the corresponding value for SOFA was 0·12 (0·05-0·15) and for SAPS-II 0·30 (0·08-0·48). The cross-validated area under the receiver operating characteristic curve (AUROC) for SAPS-II was 0·78 (95% CI 0·77-0·78) and 0·71 (0·70-0·72) for SOFA. Super Learner had an AUROC of 0·85 (0·84-0·85) when the explanatory variables were categorised as in SAPS-II, and of 0·88 (0·87-0·89) when the same explanatory variables were included without any transformation. Additionally, Super Learner showed better calibration properties than previous score systems. On the external validation dataset, the AUROC was 0·94 (0·90-0·98) and calibration properties were good.
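For readers less familiar with the two kinds of measure reported here, the following sketch shows one way to compute discrimination (AUROC) and a crude calibration check (mean predicted risk against observed mortality). The helper names and the six-patient example are hypothetical. Only the counts 3002 and 24,508 and the average predicted probabilities come from the findings above.

```python
def auroc(preds, ys):
    """Probability that a randomly chosen death gets a higher predicted risk
    than a randomly chosen survivor; ties count one half."""
    deaths = [p for p, y in zip(preds, ys) if y == 1]
    survivors = [p for p, y in zip(preds, ys) if y == 0]
    if not deaths or not survivors:
        raise ValueError("AUROC needs at least one death and one survivor")
    wins = sum((d > s) + 0.5 * (d == s) for d in deaths for s in survivors)
    return wins / (len(deaths) * len(survivors))

def calibration_in_the_large(preds, ys):
    """Return (mean predicted risk, observed death rate) for comparison."""
    return sum(preds) / len(preds), sum(ys) / len(ys)

# Hypothetical six-patient example: predicted risks and observed outcomes.
example_preds = [0.05, 0.10, 0.20, 0.40, 0.70, 0.90]
example_ys = [0, 0, 1, 0, 1, 1]
example_auc = auroc(example_preds, example_ys)
mean_pred, obs_rate = calibration_in_the_large(example_preds, example_ys)

# Observed hospital mortality reported in the findings: 3002 of 24,508.
observed_mortality = 3002 / 24508

# Average predicted probabilities reported in the findings.
reported_means = {"SL1": 0.12, "SL2": 0.13, "SOFA": 0.12, "SAPS-II": 0.30}
gaps = {name: abs(m - observed_mortality) for name, m in reported_means.items()}
```

On the reported figures, the average predictions of SL1, SL2, and SOFA lie within about 0·01 of the observed mortality, whereas the SAPS-II average lies roughly 0·18 above it. This is a comparison of averages only and does not replace a full calibration assessment.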
Compared with conventional severity scores, Super Learner offers improved performance for predicting hospital mortality in patients in intensive care units. A user-friendly implementation is available online and should be useful for clinicians seeking to validate our score.
Fulbright Foundation, Assistance Publique-Hôpitaux de Paris, Doris Duke Clinical Scientist Development Award, and the NIH.