Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland.
Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, Maryland.
Infect Control Hosp Epidemiol. 2019 Apr;40(4):400-407. doi: 10.1017/ice.2019.17. Epub 2019 Mar 4.
Timely identification of multidrug-resistant gram-negative infections remains an epidemiological challenge. Statistical models for predicting drug resistance can be useful where rapid diagnostics are unavailable or impractical given available resources. Logistic regression-derived risk scores are common in the healthcare epidemiology literature. Machine learning-derived decision trees are an alternative approach for developing decision support tools. Our group previously reported on a decision tree for predicting extended-spectrum β-lactamase (ESBL) bloodstream infections. Our objective in the current study was to develop a risk score from the same ESBL dataset to compare these 2 methods and to offer general guiding principles for using each approach.
Using a dataset of 1,288 patients with Escherichia coli or Klebsiella spp. bacteremia, we generated a risk score to predict the likelihood that a bacteremic patient was infected with an ESBL producer. We evaluated discrimination (original and cross-validated models) using receiver operating characteristic curves and C statistics. We compared risk score and decision tree performance, and we reviewed their practical and methodological attributes.
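For readers less familiar with the C statistic, the sketch below shows one common way to compute it: as the fraction of (ESBL-positive, ESBL-negative) patient pairs in which the positive patient has the higher score, with ties counted as half. The function name `c_statistic` and the toy scores and labels are assumptions made up for illustration; they are not the study data and not necessarily how the authors computed their values.

```python
def c_statistic(scores, labels):
    """Concordance (C) statistic, equal to the area under the ROC curve.

    scores: risk scores, higher meaning more likely ESBL-positive.
    labels: 1 for ESBL-positive, 0 for ESBL-negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive patient ranked higher
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data (not from the study).
toy_scores = [1, 3, 4, 6, 7, 9]
toy_labels = [0, 0, 1, 0, 1, 1]
print(round(c_statistic(toy_scores, toy_labels), 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of ESBL-positive from ESBL-negative patients.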
In total, 194 patients (15%) had bacteremia caused by ESBL-producing organisms. The clinical risk score included 14 variables, compared to the 5 decision-tree variables. The positive and negative predictive values of the risk score and decision tree were similar (>90%), but the C statistic of the risk score (0.87) was 10% higher than that of the decision tree.
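As a reminder of how positive and negative predictive values are defined, the following minimal sketch derives them, along with sensitivity and specificity, from the four cells of a 2x2 classification table. The function name `predictive_values` and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's results.

```python
def predictive_values(tp, fp, tn, fn):
    """Return PPV, NPV, sensitivity, and specificity from 2x2 counts.

    tp: predicted ESBL, truly ESBL        fp: predicted ESBL, truly non-ESBL
    tn: predicted non-ESBL, truly non-ESBL  fn: predicted non-ESBL, truly ESBL
    """
    return {
        "ppv": tp / (tp + fp),          # P(ESBL | predicted ESBL)
        "npv": tn / (tn + fn),          # P(non-ESBL | predicted non-ESBL)
        "sensitivity": tp / (tp + fn),  # P(predicted ESBL | ESBL)
        "specificity": tn / (tn + fp),  # P(predicted non-ESBL | non-ESBL)
    }


# Hypothetical counts (not from the study).
toy_metrics = predictive_values(tp=30, fp=10, tn=150, fn=10)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common ESBL production is in the population being tested, so they may not transfer to settings with a different prevalence.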
A decision tree and risk score performed similarly for predicting ESBL infection. The decision tree was more user-friendly, with fewer variables for the end user, whereas the risk score offered higher discrimination and greater flexibility for adjusting sensitivity and specificity.
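The flexibility attributed to the risk score comes from the choice of cutoff: moving the cutoff trades sensitivity against specificity, whereas a decision tree yields a single fixed classification. The sketch below illustrates this under the assumption that a patient is classified as ESBL-positive when the score is at or above the cutoff. The function name `sens_spec_at`, the cutoffs, and the toy data are hypothetical and not taken from the study.

```python
def sens_spec_at(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (not from the study).
toy_scores = [1, 3, 4, 6, 7, 9]
toy_labels = [0, 0, 1, 0, 1, 1]

# A lower cutoff favors sensitivity; a higher cutoff favors specificity.
for cutoff in (4, 7):
    sens, spec = sens_spec_at(toy_scores, toy_labels, cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy example, lowering the cutoff captures every ESBL-positive case at the cost of a false positive, while raising it removes the false positive but misses a true case.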