Development of a Machine Learning Based Web Application for Early Diagnosis of COVID-19 Based on Symptoms.

Author information

Villavicencio Charlyn Nayve, Macrohon Julio Jerison, Inbaraj Xavier Alphonse, Jeng Jyh-Horng, Hsieh Jer-Guang

Affiliations

Department of Information Engineering, I-Shou University, Kaohsiung City 84001, Taiwan.

College of Information and Communications Technology, Bulacan State University, Malolos City 3000, Philippines.

Publication information

Diagnostics (Basel). 2022 Mar 27;12(4):821. doi: 10.3390/diagnostics12040821.

DOI:10.3390/diagnostics12040821

PMID:35453869

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9026809/

Abstract

Detecting the presence of a disease requires laboratory tests, testing kits, and devices; however, these are not always on hand. This study proposes a new approach to disease detection that uses machine learning algorithms to analyze the symptoms a person is experiencing, without requiring laboratory tests. Six supervised machine learning algorithms (J48 decision tree, random forest, support vector machine, k-nearest neighbors, naïve Bayes, and artificial neural networks) were applied to the "COVID-19 Symptoms and Presence Dataset" from Kaggle. Through hyperparameter optimization and 10-fold cross-validation, we obtained the best attainable performance of each algorithm. A comparative analysis was performed according to accuracy, sensitivity, specificity, and area under the ROC curve. Results show that random forest, support vector machine, k-nearest neighbors, and artificial neural networks outperformed the other algorithms, attaining 98.84% accuracy, 100% sensitivity, 98.79% specificity, and 98.84% area under the ROC curve. Finally, we developed a web application that allows users to select the symptoms they are currently experiencing and uses the developed prediction model to predict the presence of COVID-19. With this mechanism, the proposed method can effectively predict the presence or absence of COVID-19 in a person immediately, in real time, without laboratory tests, kits, or devices.
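To make the evaluation terms in the abstract concrete, the following is a minimal sketch of how accuracy, sensitivity, and specificity are computed from binary predictions, and how a 10-fold split partitions a dataset. It is not the authors' code: the function names `binary_metrics` and `k_fold_indices` are hypothetical, the labels are toy values rather than data from the Kaggle dataset, and the split is a plain contiguous one with no shuffling or stratification.

```python
from typing import Dict, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for labels 1 (positive) and 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns one (train_indices, test_indices) pair per fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds get one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Toy labels (assumed for illustration): every positive is caught,
    # but one negative is flagged as positive.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 1, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
    print(len(k_fold_indices(25, k=10)))
```

In the toy labels, no positive case is missed, so sensitivity is perfect while the single false positive lowers specificity. This is the same pattern as the reported 100% sensitivity alongside 98.79% specificity.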
