From the Department of Pediatrics, All India Institute of Medical Sciences, New Delhi, India.
Pediatr Infect Dis J. 2024 Sep 1;43(9):889-901. doi: 10.1097/INF.0000000000004409. Epub 2024 Jul 26.
Timely diagnosis of neonatal sepsis is challenging. We aimed to systematically evaluate the diagnostic performance of sophisticated machine learning (ML) techniques for the prediction of neonatal sepsis.
We searched the MEDLINE, Embase, Web of Science and Cochrane CENTRAL databases using "neonate," "sepsis" and "machine learning" as search terms. We included studies that developed or validated an ML algorithm to predict neonatal sepsis and excluded those incorporating automated vital-sign data. Of 5008 records identified, 74 full-text articles were screened. Two reviewers extracted information using the CHARMS (CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies) checklist. We followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) guideline extension for diagnostic test accuracy reviews and used the PROBAST tool for risk-of-bias assessment. The primary outcome was the predictive performance of ML models in terms of sensitivity, specificity, and positive and negative predictive values. We generated a hierarchical summary receiver operating characteristic curve for pooled analysis.
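The four outcome measures named above all derive from a 2x2 table of model predictions against the reference diagnosis. The following is a minimal sketch of those definitions, not the review's analysis code; the class name, function name and counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of model prediction versus reference diagnosis (hypothetical)."""
    tp: int  # sepsis present, model predicts sepsis
    fp: int  # sepsis absent, model predicts sepsis
    fn: int  # sepsis present, model predicts no sepsis
    tn: int  # sepsis absent, model predicts no sepsis


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV from the 2x2 counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positives among all with sepsis
        "specificity": c.tn / (c.tn + c.fp),  # true negatives among all without sepsis
        "ppv": c.tp / (c.tp + c.fp),          # true positives among predicted positives
        "npv": c.tn / (c.tn + c.fn),          # true negatives among predicted negatives
    }


# Illustrative counts only; these are not data from any included study.
example = ConfusionCounts(tp=87, fp=11, fn=13, tn=89)
metrics = diagnostic_metrics(example)
```

Sensitivity and specificity are conditioned on true disease status, whereas the predictive values are conditioned on the model's output and therefore shift with how common sepsis is in the sample.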
Of 19 studies (15,984 participants) with 76 ML models, the random forest algorithm was the most commonly used. The number of candidate predictors per model ranged from 5 to 93; most models included birth weight and gestation. No study performed external validation. The risk of bias was high in 18 studies. For the prediction of any sepsis (14 studies), pooled sensitivity was 0.87 (95% credible interval: 0.75-0.94) and specificity was 0.89 (95% credible interval: 0.77-0.95). The pooled area under the receiver operating characteristic curve was 0.94 (95% credible interval: 0.92-0.96). All studies except one used data from high- or upper-middle-income countries. Because probability thresholds were unavailable, the performance could not be assessed with sufficient precision.
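The pooled sensitivity and specificity do not by themselves fix the predictive values, which also depend on how common sepsis is in the population tested. The sketch below applies Bayes' rule to the pooled point estimates reported above. The prevalence value is an assumption chosen purely for illustration and does not come from the review; the function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # expected true-positive fraction
    fn = (1 - sensitivity) * prevalence        # expected false-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # expected false-positive fraction
    tn = specificity * (1 - prevalence)        # expected true-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Pooled point estimates from the text: sensitivity 0.87, specificity 0.89.
# ASSUMPTION: a prevalence of 10%, used only to illustrate the calculation.
ppv, npv = predictive_values(sensitivity=0.87, specificity=0.89, prevalence=0.10)
```

With sensitivity and specificity held fixed, the positive predictive value falls as prevalence falls, which is one reason a model's performance may not transfer between settings.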
ML techniques have good diagnostic accuracy for neonatal sepsis. The findings highlight the need to develop context-specific models from high-burden countries.