Minne Lilian, Abu-Hanna Ameen, de Jonge Evert
Department of Medical Informatics, Academic Medical Center, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands.
Crit Care. 2008;12(6):R161. doi: 10.1186/cc7160. Epub 2008 Dec 17.
To systematically review studies evaluating the performance of Sequential Organ Failure Assessment (SOFA)-based models for predicting mortality in patients in the intensive care unit (ICU).
Medline, EMBASE and other databases were searched for English-language articles with the major objective of evaluating the prognostic performance of SOFA-based models in predicting mortality in surgical and/or medical ICU admissions. The quality of each study was assessed based on a quality framework for prognostic models.
Eighteen articles met all inclusion criteria. The studies differed widely in the SOFA derivatives used and in their methods of evaluation. Ten studies reported developing a probabilistic prognostic model, only five of which used an independent validation data set. The other studies used the SOFA-based score directly to discriminate between survivors and non-survivors without fitting a probabilistic model. Six studies compared SOFA-based models at admission with admission-based models. In five of them, the admission-based models (Acute Physiology and Chronic Health Evaluation (APACHE) II/III) were reported to have slightly better discrimination than the SOFA-based models (the area under the receiver operating characteristic curve (AUC) of SOFA-based models ranged between 0.61 and 0.88), and in one study a SOFA model had a higher AUC than the Simplified Acute Physiology Score (SAPS) II model. Four of these studies used the Hosmer-Lemeshow test to assess calibration, and none reported a lack of fit for the SOFA models. Models based on sequential SOFA scores were described in 11 studies, including maximum SOFA scores and the maximum sum of individual components of the SOFA score (AUC range: 0.69 to 0.92) and delta SOFA (AUC range: 0.51 to 0.83). Studies comparing SOFA with other organ failure scores did not consistently show superiority of one scoring system over another. Four studies combined SOFA-based derivatives with admission severity-of-illness scores, and all reported improved predictions for the combination. Study quality ranged from 11.5 to 19.5 points on a 20-point scale.
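The sequential derivatives and the discrimination measure mentioned above can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the reviewed studies: each patient is represented by a list of daily total SOFA scores, delta SOFA is taken as the maximum score minus the admission score (definitions vary between studies), and the function names and patient data are hypothetical. The AUC is computed as the proportion of non-survivor/survivor pairs in which the non-survivor has the higher score, with ties counted as one half.

```python
from typing import Sequence


def max_sofa(daily_totals: Sequence[int]) -> int:
    """Highest total SOFA score recorded during the stay."""
    return max(daily_totals)


def delta_sofa(daily_totals: Sequence[int]) -> int:
    """Assumed definition: maximum score minus the admission (first) score."""
    return max(daily_totals) - daily_totals[0]


def auc(scores: Sequence[float], died: Sequence[bool]) -> float:
    """AUC as the fraction of (non-survivor, survivor) pairs ranked correctly."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical daily SOFA totals for five patients (illustrative only).
patients = [[4, 6, 6], [3, 3, 2], [8, 10, 12], [5, 4, 4], [7, 7, 6]]
died = [True, False, True, False, False]

max_scores = [max_sofa(p) for p in patients]
delta_scores = [delta_sofa(p) for p in patients]
print("AUC of max SOFA:  ", auc(max_scores, died))
print("AUC of delta SOFA:", auc(delta_scores, died))
```

Using the score directly in this way corresponds to the studies that discriminated between survivors and non-survivors without fitting a probabilistic model; a fitted model would first map the score to a predicted probability of death.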
Models based on SOFA scores at admission had only slightly worse performance than APACHE II/III and were competitive with SAPS II models in predicting mortality in patients in the general medical and/or surgical ICU. Models with sequential SOFA scores seem to perform comparably to other organ failure scores. Combining sequential SOFA derivatives with APACHE II/III and SAPS II models clearly improved prognostic performance over either model alone. Due to the heterogeneity of the studies, it is impossible to draw general conclusions on the optimal mathematical model and optimal derivatives of SOFA scores. Future studies should use a standard evaluation methodology with a standard set of outcome measures covering discrimination, calibration and accuracy.
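As a rough illustration of the calibration and accuracy measures called for above, the following Python sketch computes a Hosmer-Lemeshow statistic and a Brier score for predicted mortality probabilities. The choice of the Brier score as the accuracy measure, the equal-size grouping by sorted predicted risk, and the function names are assumptions made for this example, not a methodology prescribed by the review. Converting the statistic to a p-value requires a chi-square distribution (conventionally with the number of groups minus two degrees of freedom), which is omitted here.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def hosmer_lemeshow_statistic(
    probs: Sequence[float], outcomes: Sequence[int], groups: int = 10
) -> float:
    """Hosmer-Lemeshow statistic over groups of patients sorted by predicted risk."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected deaths in the group
        observed = sum(y for _, y in chunk)  # observed deaths in the group
        denom = expected * (1 - expected / size)
        if denom > 0:  # skip groups where the variance term is zero
            stat += (observed - expected) ** 2 / denom
    return stat


# Hypothetical predicted mortality risks and observed outcomes (1 = died).
predicted = [0.2, 0.2, 0.8, 0.8]
observed = [0, 0, 1, 1]
print("Brier score:", brier_score(predicted, observed))
print("Hosmer-Lemeshow statistic:", hosmer_lemeshow_statistic(predicted, observed, groups=2))
```

A lower Brier score indicates more accurate probabilities, and a small Hosmer-Lemeshow statistic relative to its reference distribution indicates no evidence of lack of fit.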