Latif Siddique, Usman Muhammad, Manzoor Sanaullah, Iqbal Waleed, Qadir Junaid, Tyson Gareth, Castro Ignacio, Razi Adeel, Boulos Maged N Kamel, Weller Adrian, Crowcroft Jon
University of Southern Queensland, Springfield, Queensland 4300, Australia.
Distributed Sensing Systems Group, Data61, CSIRO, Pullenvale, QLD 4069, Australia.
IEEE Trans Artif Intell. 2020 Sep 2;1(1):85-103. doi: 10.1109/TAI.2020.3020521. eCollection 2020 Aug.
COVID-19, an infectious disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization (WHO) in March 2020. By mid-August 2020, more than 21 million people had tested positive worldwide. Infections have been growing rapidly, and tremendous efforts are being made to fight the disease. In this paper, we attempt to systematise the various COVID-19 research activities leveraging data science, where we define data science broadly to encompass the various methods and tools, including those from artificial intelligence (AI), machine learning (ML), statistics, modeling, simulation, and data visualization, that can be used to store, process, and extract insights from data. In addition to reviewing the rapidly growing body of recent research, we survey public datasets and repositories that can be used for further work to track COVID-19 spread and mitigation strategies. As part of this, we present a bibliometric analysis of the papers produced in this short span of time. Finally, building on these insights, we highlight common challenges and pitfalls observed across the surveyed works. We have also created a live resource repository at https://github.com/Data-Science-and-COVID-19/Leveraging-Data-Science-To-Combat-COVID-19-A-Comprehensive-Review that we intend to keep updated with the latest resources, including new papers and datasets.