Clermont G, Angus D C, DiRusso S M, Griffin M, Linde-Zwirble W T
Critical Care Medicine Division, the Department of Anesthesiology and Critical Care Medicine, and the Center for Research on Health Care, University of Pittsburgh, Pittsburgh, PA, USA.
Crit Care Med. 2001 Feb;29(2):291-6. doi: 10.1097/00003246-200102000-00012.
Logistic regression (LR), commonly used for hospital mortality prediction, has limitations. Artificial neural networks (ANNs) have been proposed as an alternative. We compared the performance of these approaches by using stepwise reductions in sample size.
Prospective cohort study.
Seven intensive care units (ICUs) at one tertiary care center.
Patients were 1,647 ICU admissions for whom first-day Acute Physiology and Chronic Health Evaluation III variables were collected.
None.
We constructed LR and ANN models on a random set of 1,200 admissions (development set) and used the remaining 447 as the validation set. We repeated model construction on progressively smaller development sets (800, 400, and 200 admissions) and retested on the original validation set (n = 447). For each development set, we constructed models from two LR and two ANN architectures, organizing the independent variables differently. With the 1,200-admission development set, all models had good fit and discrimination on the validation set, where fit was assessed by the Hosmer-Lemeshow C statistic (range, 10.6-15.3; p ≥ .05) and standardized mortality ratio (SMR) (range, 0.93 [95% confidence interval, 0.79-1.15] to 1.09 [95% confidence interval, 0.89-1.38]), and discrimination was assessed by the area under the receiver operating characteristic curve (range, 0.80-0.84). As development set sample size decreased, model performance on the validation set deteriorated rapidly, although the ANNs retained marginally better fit at 800 (the best C statistic was 26.3 [p = .0009] for the LR models and 13.1 [p = .11] for the ANN models). Below 800, fit was poor with both approaches, with high C statistics (ranging from 22.8 [p < .004] to 633 [p < .0001]) and highly biased SMRs (seven of the eight models below 800 had SMRs of <0.85, with an upper confidence limit of <1). Discrimination ranged from 0.74 to 0.84 below 800.
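The three validation measures named above (Hosmer-Lemeshow C statistic, SMR, and area under the receiver operating characteristic curve) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the use of ten equal-size risk groups, and the toy data are assumptions made for illustration, and predicted risks are assumed to lie strictly between 0 and 1. The p value for the C statistic (a chi-square tail probability) and the SMR confidence interval are omitted.

```python
def auc(y, p):
    """Area under the ROC curve: probability that a randomly chosen death
    has a higher predicted risk than a randomly chosen survivor (ties = 0.5)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def smr(y, p):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return sum(y) / sum(p)


def hosmer_lemeshow_c(y, p, groups=10):
    """Hosmer-Lemeshow C statistic over (near) equal-size groups of
    admissions sorted by predicted risk. `groups=10` is an assumption."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        size = len(idx)
        obs = sum(y[i] for i in idx)   # observed deaths in the group
        exp = sum(p[i] for i in idx)   # expected deaths in the group
        # Contribution from deaths plus contribution from survivors.
        stat += (obs - exp) ** 2 / exp
        stat += ((size - obs) - (size - exp)) ** 2 / (size - exp)
    return stat


# Toy example (hypothetical data, not from the study).
outcomes = [0, 0, 1, 1]            # 1 = died in hospital
risks = [0.1, 0.2, 0.8, 0.9]       # model-predicted probability of death
print(auc(outcomes, risks))
print(smr(outcomes, risks))
print(hosmer_lemeshow_c(outcomes, risks, groups=2))
```

A large C statistic signals disagreement between observed and expected deaths across risk groups (poor fit), and an SMR well below 1 means the model predicts more deaths than occurred. These are the patterns reported for the smaller development sets.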
When sample size is adequate, LR and ANN models have similar performance. However, development sets of ≤800 admissions were generally inadequate. This is concerning, given typical sample sizes used for individual ICU mortality prediction.