Plante Timothy B, Blau Aaron M, Berg Adrian N, Weinberg Aaron S, Jun Ik C, Tapson Victor F, Kanigan Tanya S, Adib Artur B
Larner College of Medicine at the University of Vermont, Colchester, VT, United States.
University of Vermont Medical Center, Burlington, VT, United States.
J Med Internet Res. 2020 Dec 2;22(12):e24048. doi: 10.2196/24048.
Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant costs to run the test. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. Machine learning modalities integrating findings from these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients.
We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments.
Using clinical data from emergency departments (EDs) from 66 US hospitals before the pandemic (before the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV).
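The performance measures named above can be illustrated with a minimal sketch. It is not the study's code: the function names, the toy scores and labels, and the convention that a risk score at or above the cutoff counts as "not ruled out" are all assumptions made for illustration.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive.

    Assumption: a score at or above the cutoff means COVID-19 is not ruled out.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(scores, labels):
    """AUROC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data on a 0-100 risk scale; 1 = PCR-confirmed case, 0 = control.
toy_scores = [0.5, 1.5, 3.0, 40.0, 75.0, 1.2, 0.8, 2.5]
toy_labels = [0, 0, 0, 1, 1, 1, 0, 0]

sensitivity, specificity = sens_spec(toy_scores, toy_labels, cutoff=2.0)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
print(f"AUROC={auroc(toy_scores, toy_labels):.3f}")
```

Lowering the cutoff in such a sketch trades specificity for sensitivity, which is the trade-off a rule-out model is tuned around.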
Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97%, respectively.
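The prevalence-specific NPVs follow from the reported sensitivity and specificity through Bayes' rule. The sketch below is an illustration, not the authors' code; the function name `npv` is hypothetical, and the inputs are the sensitivity (92.6%) and specificity (59.9%) reported at the cutoff of 2.0.

```python
def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from sensitivity, specificity, and prevalence."""
    true_neg = specificity * (1.0 - prevalence)      # non-cases correctly ruled out
    false_neg = (1.0 - sensitivity) * prevalence     # cases wrongly ruled out
    return true_neg / (true_neg + false_neg)


# Sensitivity and specificity at the cutoff of 2.0, as reported in the abstract.
for prevalence in (0.01, 0.10, 0.20):
    print(f"prevalence {prevalence:.0%}: NPV {npv(0.926, 0.599, prevalence):.1%}")
```

Rounded to one decimal place, this gives 99.9%, 98.6%, and 97.0%, matching the reported values.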
A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing.