Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
Critical Care Division, University College London Hospitals NHS Trust, London, UK.
BMJ. 2020 May 20;369:m1501. doi: 10.1136/bmj.m1501.
Objective: To provide an overview and critical appraisal of early warning scores for adult hospital patients.
Design: Systematic review.
Data sources: Medline, CINAHL, PsycInfo, and Embase until June 2019.
Eligibility criteria for study selection: Studies describing the development or external validation of an early warning score for adult hospital inpatients.
Results: 13 171 references were screened and 95 articles were included in the review. 11 studies were development only, 23 were development and external validation, and 61 were external validation only. Most early warning scores were developed for use in the United States (n=13/34, 38%) and the United Kingdom (n=10/34, 29%). Death was the most frequent prediction outcome for development studies (n=10/23, 44%) and validation studies (n=66/84, 79%), with different time horizons (the most frequent was 24 hours). The most common predictors were respiratory rate (n=30/34, 88%), heart rate (n=28/34, 83%), oxygen saturation, temperature, and systolic blood pressure (all n=24/34, 71%). Age (n=13/34, 38%) and sex (n=3/34, 9%) were less frequently included. Key details of the analysis populations were often not reported in development studies (n=12/29, 41%) or validation studies (n=33/84, 39%). Small sample sizes and insufficient numbers of event patients were common in model development and external validation studies. Missing data were often discarded, with just one study using multiple imputation. Only nine of the early warning scores that were developed were presented in sufficient detail to allow individualised risk prediction. Internal validation was carried out in 19 studies, but recommended approaches such as bootstrapping or cross validation were rarely used (n=4/19, 22%). Model performance was frequently assessed using discrimination (development n=18/22, 82%; validation n=69/84, 82%), while calibration was seldom assessed (validation n=13/84, 15%). All included studies were rated at high risk of bias.
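To illustrate the two performance measures named in the results, the following minimal Python sketch computes discrimination as a concordance (c) statistic and a crude calibration check as the ratio of observed to expected events. The function names and the toy data are hypothetical and are not taken from any study in the review; real evaluations would typically also examine calibration plots and use far larger samples.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Discrimination: probability that a randomly chosen patient with the
    event has a higher score than one without it (ties count as half)."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_expected_ratio(predicted_risks, outcomes):
    """Calibration-in-the-large: observed events divided by the sum of
    predicted risks. Values near 1 suggest agreement on average."""
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(outcomes) / expected


# Hypothetical toy data: predicted risk of death and observed outcome (1 = died).
predicted_risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
outcomes = [0, 0, 1, 0, 1, 1]

# 8 of the 9 event/non-event pairs are correctly ordered, so c is about 0.89.
print(round(c_statistic(predicted_risks, outcomes), 3))
# 3 observed events against 2.15 expected gives a ratio of about 1.40.
print(round(observed_expected_ratio(predicted_risks, outcomes), 3))
```

In this toy example the score ranks patients well (high c statistic) yet under-predicts the number of events (ratio above 1), which shows why reporting discrimination alone can hide poor calibration.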
Conclusions: Early warning scores are widely used prediction models that are often mandated in daily clinical practice to identify early clinical deterioration in hospital patients. However, many early warning scores in clinical use were found to have methodological weaknesses. Early warning scores might not perform as well as expected and therefore they could have a detrimental effect on patient care. Future work should focus on following recommended approaches for developing and evaluating early warning scores, and investigating the impact and safety of using these scores in clinical practice.
Systematic review registration: PROSPERO CRD42017053324.