From the Department of Emergency Medicine, University of Ottawa, Ottawa, Ontario, Canada; and the Ottawa Hospital Research Institute, Ottawa, Ontario, Canada.
Acad Emerg Med. 2021 Feb;28(2):184-196. doi: 10.1111/acem.14190. Epub 2021 Jan 2.
Machine learning (ML) has shown promise in other medical fields. We sought to determine whether ML models perform better than usual care in diagnostic and prognostic prediction for emergency department (ED) patients.
In this systematic review, we searched MEDLINE, Embase, CENTRAL, and CINAHL from inception to October 17, 2019. We included studies that compared ML models with usual care methods (triage-based scores, clinical prediction tools, clinician judgment) for diagnostic and prognostic prediction in ED patients, using predictor variables readily available to ED clinicians. We extracted commonly reported performance metrics of model discrimination and classification. We used the PROBAST tool for risk of bias assessment (PROSPERO registration: CRD42020158129).
The search yielded 1,656 unique records, of which 23 studies involving 16,274,647 patients were included. In all seven diagnostic studies, ML models outperformed usual care in all performance metrics. In six studies assessing in-hospital mortality, the best-performing ML models had better discrimination (area under the receiver operating characteristic curve [AUROC] = 0.74-0.94) than any clinical decision tool (AUROC = 0.68-0.81). In four studies assessing hospitalization, ML models had better discrimination (AUROC = 0.80-0.83) than triage-based scores (AUROC = 0.68-0.82). Clinical heterogeneity precluded meta-analysis. Most studies had high risk of bias due to lack of external validation, low event rates, and insufficient reporting of calibration.
Our review suggests that ML may have better prediction performance than usual care for ED patients with a variety of clinical presentations and outcomes. However, prediction model reporting guidelines should be followed to provide clinically applicable data. Interventional trials are needed to assess the impact of ML models on patient-centered outcomes.