Alballa Norah, Al-Turaiki Isra
Computer Science Department, College of Computer and Information Sciences, King Saud University, Saudi Arabia.
Information Technology Department, College of Computer and Information Sciences, King Saud University, Saudi Arabia.
Inform Med Unlocked. 2021;24:100564. doi: 10.1016/j.imu.2021.100564. Epub 2021 Apr 3.
The existence of widespread COVID-19 infections has prompted worldwide efforts to control and manage the virus, and hopefully curb it completely. One important line of research is the use of machine learning (ML) to understand and fight COVID-19. This is currently an active research field. Although there are already many surveys in the literature, there is a need to keep up with the rapidly growing number of publications on COVID-19-related applications of ML. This paper presents a review of recent reports on ML algorithms used in relation to COVID-19. We focus on the potential of ML for two main applications: diagnosis of COVID-19, and prediction of mortality risk and severity, using readily available clinical and laboratory data. Aspects related to algorithm types, training data sets, and feature selection are discussed. From the work we cover, published between January 2020 and January 2021, a few key points have come to light. The bulk of the ML algorithms used in these two applications are supervised learning algorithms. The established models are yet to be used in real-world implementations, and much of the associated research is experimental. The diagnostic and prognostic features discovered by ML models are consistent with results presented in the medical literature. A limitation of the existing applications is the use of imbalanced data sets that are prone to selection bias.