Yanamala Naveena, Krishna Nanda H, Hathaway Quincy A, Radhakrishnan Aditya, Sunkara Srinidhi, Patel Heenaben, Farjo Peter, Patel Brijesh, Sengupta Partho P
Division of Cardiology, West Virginia University Medicine Heart & Vascular Institute, Morgantown, WV, USA.
Institute for Software Research, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
NPJ Digit Med. 2021 Jun 4;4(1):95. doi: 10.1038/s41746-021-00467-8.
Patients with influenza and SARS-CoV-2/Coronavirus disease 2019 (COVID-19) infections have different clinical courses and outcomes. We developed and validated a supervised machine learning pipeline to distinguish the two viral infections using the vital signs and demographic data available from the first hospital/emergency room encounters of 3883 patients who had confirmed diagnoses of influenza A/B, COVID-19, or negative laboratory test results. Our multiclass classifier achieved an area under the receiver operating characteristic curve (ROC AUC) of at least 97%. The predictive models were externally validated on 15,697 encounters from 3125 patients in the TriNetX database, which contains patient-level data from different healthcare organizations. The influenza vs COVID-19-positive model had an AUC of 98.8% on the internal test set and 92.8% on the external test set. Our study illustrates the potential of machine-learning models for accurately distinguishing the two viral infections. The code is available at https://github.com/ynaveena/COVID-19-vs-Influenza, and the models may have utility as a frontline diagnostic tool to aid healthcare workers in triaging patients once the two viral infections start co-circulating in communities.
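The abstract reports performance as ROC AUC for a three-class problem (influenza, COVID-19, negative). As a rough illustration of what that metric measures, the sketch below computes a one-vs-rest AUC per class and takes their unweighted mean. It is not the authors' code: the abstract does not state how the multiclass AUC was aggregated, so the macro average is an assumption, and the function names (`binary_roc_auc`, `macro_ovr_auc`), class labels, and toy probabilities are hypothetical and do not come from the study.

```python
from typing import Dict, Sequence


def binary_roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    y_true: Sequence[str],
    proba: Sequence[Dict[str, float]],
    classes: Sequence[str],
) -> float:
    """Unweighted mean of one-vs-rest AUCs (assumed aggregation, see lead-in)."""
    aucs = []
    for c in classes:
        binary = [1 if y == c else 0 for y in y_true]  # class c vs everything else
        scores = [p[c] for p in proba]                 # predicted probability of c
        aucs.append(binary_roc_auc(binary, scores))
    return sum(aucs) / len(aucs)


# Hypothetical toy predictions for six encounters (not data from the study).
CLASSES = ["influenza", "covid19", "negative"]
y_true = ["influenza", "covid19", "negative", "covid19", "influenza", "negative"]
proba = [
    {"influenza": 0.70, "covid19": 0.20, "negative": 0.10},
    {"influenza": 0.10, "covid19": 0.80, "negative": 0.10},
    {"influenza": 0.20, "covid19": 0.20, "negative": 0.60},
    {"influenza": 0.65, "covid19": 0.25, "negative": 0.10},  # COVID-19 case scored as influenza
    {"influenza": 0.60, "covid19": 0.30, "negative": 0.10},
    {"influenza": 0.30, "covid19": 0.30, "negative": 0.40},
]

if __name__ == "__main__":
    print(macro_ovr_auc(y_true, proba, CLASSES))  # 0.875
```

In this toy example the single misranked COVID-19 encounter lowers the influenza and COVID-19 one-vs-rest AUCs to 0.875 and 0.75, while the negative class stays at 1.0. The pairwise loop is quadratic in the number of encounters; on a cohort of the size described above, a rank-based implementation from an established library would be the more practical choice.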