DISCo, Università degli Studi di Milano-Bicocca, Milan, Italy.
IRCCS Istituto Ortopedico Galeazzi, Laboratory of Clinical Chemistry and Microbiology, Milan, Italy.
Clin Chem Lab Med. 2020 Oct 21;59(2):421-431. doi: 10.1515/cclm-2020-1294.
The rRT-PCR test, the current gold standard for the detection of coronavirus disease (COVID-19), has known shortcomings, such as long turnaround times, potential reagent shortages, false-negative rates of around 15-20%, and expensive equipment. The hematochemical values of routine blood exams could represent a faster and less expensive alternative.
Three training datasets of hematochemical values from 1,624 patients (52% COVID-19 positive) admitted to San Raphael Hospital (OSR) from February to May 2020 were used to develop machine learning (ML) models: the complete OSR dataset (72 features: complete blood count (CBC), biochemical, coagulation, blood gas analysis and CO-oximetry values, plus age, sex and specific symptoms at triage) and two sub-datasets (the COVID-specific and CBC datasets, with 32 and 21 features, respectively). A further 58 cases (50% COVID-19 positive) from another hospital and 54 negative patients collected at OSR in 2018 were used for internal-external and external validation, respectively.
We developed five ML models. On the complete OSR dataset, the area under the receiver operating characteristic curve (AUC) of the algorithms ranged from 0.83 to 0.90; on the COVID-specific dataset, from 0.83 to 0.87; and on the CBC dataset, from 0.74 to 0.86. The validations also gave good results: AUC from 0.75 to 0.78 on the internal-external validation, and specificity from 0.92 to 0.96 on the external validation.
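For readers less familiar with the two metrics reported above, the following is a minimal sketch of how AUC and specificity can be computed from model scores. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity(labels, scores, threshold=0.5):
    """Fraction of actual negatives scored below the (assumed) threshold."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not neg:
        raise ValueError("specificity needs at least one negative case")
    return sum(s < threshold for s in neg) / len(neg)


# Toy data, invented for illustration (1 = COVID-19 positive, 0 = negative).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(labels, scores)
spec = specificity(labels, scores)
print(f"AUC = {auc:.3f}, specificity = {spec:.3f}")
```

A validation set that contains only negative patients has no positive cases to rank against, so AUC is undefined there (the sketch raises `ValueError`), while specificity can still be computed.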
ML can be applied to blood tests as both an adjunct and an alternative to rRT-PCR for the fast and cost-effective identification of COVID-19-positive patients. This is especially useful in developing countries and in countries facing a rise in infections.