Pollack M M, Koch M A, Bartel D A, Rapoport I, Dhanireddy R, El-Mohandes A A, Harkavy K, Subramanian K N
Department of Critical Care Medicine, Children's National Medical Center, Washington, DC 20010, USA.
Pediatrics. 2000 May;105(5):1051-7. doi: 10.1542/peds.105.5.1051.
Risk-adjusted severity of illness is frequently used in clinical research and quality assessments. Although there are multiple methods designed for neonates, they have been infrequently compared and some have not been assessed in large samples of very low birth weight (VLBW; <1500 g) infants.
To test and compare published neonatal mortality prediction models, including Clinical Risk Index for Babies (CRIB), Score for Neonatal Acute Physiology (SNAP), SNAP-Perinatal Extension (SNAP-PE), Neonatal Therapeutic Interventions Scoring System, the National Institute of Child Health and Human Development (NICHD) network model, and other individual admission factors such as birth weight, low Apgar score (<7 at 5 minutes), and small for gestational age status in a cohort of VLBW infants from the Washington, DC area.
Data were collected on 476 VLBW infants admitted to 8 neonatal intensive care units between October 1994 and February 1997. The calibration (closeness of total observed deaths to the predicted total) of models with published coefficients (SNAP-PE, CRIB, and NICHD) was assessed using the standardized mortality ratio. Discrimination was quantified as the area under the curve (AUC) for the receiver operating characteristic curves. Calibrated models were derived for the current database using logistic regression techniques. Goodness-of-fit of predicted to observed probabilities of death was assessed with the Hosmer-Lemeshow goodness-of-fit test.
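As an illustration of the two performance measures described above, the following is a minimal sketch, not the authors' code. It assumes the standardized mortality ratio is computed as observed deaths divided by the sum of predicted death probabilities, and that the AUC is computed in its rank-based form (the probability that a randomly chosen non-survivor has a higher predicted risk than a randomly chosen survivor). The function names and the toy data are hypothetical.

```python
from typing import Sequence


def standardized_mortality_ratio(died: Sequence[int],
                                 predicted: Sequence[float]) -> float:
    """Observed deaths divided by expected deaths.

    Expected deaths are the sum of the model's predicted death
    probabilities. A ratio below 1 means the model predicted more
    deaths than were observed.
    """
    expected = sum(predicted)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return sum(died) / expected


def roc_auc(died: Sequence[int], predicted: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Every (non-survivor, survivor) pair is compared; a pair scores 1
    when the non-survivor has the higher predicted risk and 0.5 on ties.
    """
    deaths = [p for d, p in zip(died, predicted) if d == 1]
    survivors = [p for d, p in zip(died, predicted) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for p_death in deaths:
        for p_surv in survivors:
            if p_death > p_surv:
                score += 1.0
            elif p_death == p_surv:
                score += 0.5
    return score / (len(deaths) * len(survivors))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = died, 0 = survived.
    died = [1, 0, 0, 1, 0, 0, 0, 0]
    predicted = [0.8, 0.3, 0.2, 0.4, 0.5, 0.1, 0.6, 0.1]
    print("SMR:", round(standardized_mortality_ratio(died, predicted), 3))
    print("AUC:", round(roc_auc(died, predicted), 3))
```

In this toy example the model predicts about three deaths while two occur, so the ratio falls below 1 even though most non-survivors are ranked above survivors. This shows how a model can discriminate well while being poorly calibrated.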
The calibration of published algorithms applied to our data was poor. The standardized mortality ratios for the NICHD, CRIB, and SNAP-PE models were 0.65, 0.56, and 0.82, respectively. Discrimination of all the models was excellent (AUC range: 0.863-0.930). Surprisingly, birth weight performed much better than in previous analyses, with an AUC of 0.869. The best models using 12-hour and 24-hour postadmission data significantly outperformed the best model based on birth data only but were not significantly different from each other. The variables in the best model were birth weight, birth weight squared, low 5-minute Apgar score, and SNAP (AUC = 0.930).
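For orientation, a logistic regression model with the four variables named above would take the general form sketched below. The coefficients $\beta_0,\dots,\beta_4$ are placeholders, since the abstract does not report their values, and the symbol names ($\mathrm{BW}$ for birth weight, $\mathrm{LowApgar}_5$ for an indicator of a 5-minute Apgar score below 7) are assumptions made for illustration.

```latex
\[
\log\frac{P(\text{death})}{1 - P(\text{death})}
  = \beta_0
  + \beta_1\,\mathrm{BW}
  + \beta_2\,\mathrm{BW}^2
  + \beta_3\,\mathrm{LowApgar}_5
  + \beta_4\,\mathrm{SNAP}
\]
```

The squared birth weight term allows the log-odds of death to vary nonlinearly with birth weight.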
Published models for severity of illness overpredicted hospital mortality in this set of VLBW infants, indicating a need for frequent recalibration. Discrimination for these severity of illness scores remains excellent. Birth variables should be reevaluated as a method to control for severity of illness in predicting mortality.