Zhou Xianlong, Wang Zhichao, Li Shaoping, Liu Tanghai, Wang Xiaolin, Xia Jian, Zhao Yan
Emergency Center, Hubei Clinical Research Center for Emergency and Resuscitation, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, 430071, People's Republic of China.
Emergency Department, Wuhan No. 1 Hospital, Wuhan, Hubei, 430022, People's Republic of China.
Risk Manag Healthc Policy. 2021 Feb 15;14:595-604. doi: 10.2147/RMHP.S291498. eCollection 2021.
Given the current state of novel coronavirus disease (COVID-19) epidemic control, COVID-19 and influenza are highly likely to circulate at the same time during the approaching winter season. However, no tool is available that can rapidly and accurately distinguish between the two diseases without laboratory evidence of the specific pathogen.
Patients with laboratory-confirmed COVID-19 or influenza between December 1, 2019 and February 29, 2020, from Zhongnan Hospital of Wuhan University (ZHWU) and Wuhan No. 1 Hospital (WNH), both located in Wuhan, China, were included in the analysis. A machine learning-based decision model was developed using the XGBoost algorithm.
Data from 357 COVID-19 and 1893 influenza patients at ZHWU were split into training and testing sets in a 7:3 ratio, while the dataset from WNH (308 COVID-19 and 312 influenza patients) was reserved for external testing. The model-based decision tree selected age, serum high-sensitivity C-reactive protein, and circulating monocytes as meaningful indicators for classifying COVID-19 and influenza cases. The model performed well in distinguishing COVID-19 from influenza cases, with an area under the receiver operating characteristic curve (AUC) of 0.94 (95% CI 0.93, 0.96) in the training set, 0.93 (95% CI 0.90, 0.96) in the testing set, and 0.84 (95% CI 0.81, 0.87) in the external set.
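The evaluation arithmetic described above (a 7:3 split and an AUC with a 95% CI) can be illustrated with a minimal standard-library sketch. Everything here is an assumption for illustration: the abstract does not say whether the split was stratified or how the confidence intervals were computed, so the per-class split, the percentile bootstrap, the function names, and the synthetic scores are all hypothetical stand-ins rather than the authors' method.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/test, class by class (assumed; not stated in the abstract)."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)

def auc(labels, scores):
    """AUC via the Mann-Whitney rank statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (one common choice, assumed here)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic stand-in for the ZHWU cohort sizes: 1 = COVID-19, 0 = influenza.
labels = [1] * 357 + [0] * 1893
rng = random.Random(42)
# Made-up scores; a real model would supply predicted probabilities here.
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

train_idx, test_idx = stratified_split(labels)
test_labels = [labels[i] for i in test_idx]
test_scores = [scores[i] for i in test_idx]
point = auc(test_labels, test_scores)
low, high = bootstrap_auc_ci(test_labels, test_scores, n_boot=200)
print(f"train={len(train_idx)} test={len(test_idx)}")
print(f"test AUC={point:.2f} (95% CI {low:.2f}, {high:.2f})")
```

Under these assumptions, a per-class 7:3 split of the ZHWU cohort gives 1575 training and 675 testing cases. The printed AUC reflects only the synthetic scores and has no bearing on the reported results.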
Machine learning provides a tool that can rapidly and accurately distinguish between COVID-19 and influenza cases. This finding may be particularly useful in regions where COVID-19 and influenza co-occur on a large scale and resources for laboratory testing of specific pathogens are limited.