Department of Computer Engineering, School of Engineering, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran.
School of Medicine, Shahroud University of Medical Sciences, Shahroud, Iran.
BMC Public Health. 2021 Jun 7;21(1):1087. doi: 10.1186/s12889-021-11058-3.
The high prevalence of COVID-19 has made it a new pandemic. Predicting both its prevalence and incidence throughout the world is crucial to help health professionals make key decisions. In this study, we aim to predict the incidence of COVID-19 within a two-week period to better manage the disease.
The COVID-19 datasets provided by Johns Hopkins University contain information on COVID-19 cases in different geographic regions since January 22, 2020, and are updated daily. Data from 252 such regions, as of March 29, 2020, were analyzed, comprising 17,136 records and 4 variables: latitude, longitude, date, and records. To design the incidence pattern for each geographic region, we used information on the region and its neighboring areas gathered in the 2 weeks before the pattern was designed. A model was then developed to predict the incidence rate for the coming 2 weeks using a Least-Square Boosting Classification algorithm.
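As an illustration of the boosting idea described above, the following is a minimal sketch of least-squares boosting with one-split regression stumps. It is not the authors' implementation: the function names, the learning rate, the number of rounds, the two features (cases in the region and in its neighbors over the previous 2 weeks), and the toy numbers are all assumptions made for the example, and the sketch treats the target as a numeric count.

```python
from statistics import mean


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no split possible: fall back to a constant stump
        m = mean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_lsboost(X, y, n_rounds=50, learning_rate=0.1):
    """Fit each new stump to the current residuals (squared-error loss)."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Made-up toy data (NOT from the study):
# features = [cases in region, cases in neighbors] over the previous 2 weeks,
# target   = cases in the region over the following 2 weeks.
X = [[10, 5], [20, 15], [40, 30], [80, 60], [160, 150], [320, 300]]
y = [25, 50, 95, 200, 410, 790]

model = fit_lsboost(X, y)
```

Each round fits a weak learner to the residuals left by the previous rounds, so with squared-error loss the training error cannot increase from one round to the next. The small learning rate trades faster fitting for smoother updates.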
The model was presented for three groups based on the incidence rate: less than 200, between 200 and 1000, and above 1000. The mean absolute errors of model evaluation were 4.71%, 8.54%, and 6.13%, respectively. Comparing the forecasts with the actual values for the period in question also showed that the proposed model predicted the number of globally confirmed COVID-19 cases with an accuracy of 98.45%.
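The errors above are reported as percentages, which suggests a relative error measure, although the exact formula is not given here. The sketch below assumes a mean absolute percentage error computed separately for each of the three groups. The function names, the handling of the boundary value 1000, and the sample numbers are assumptions for illustration only and are not data from the study.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed in percent."""
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def group_of(cases):
    """Assign a region to one of the three groups named in the text.

    Assumption: exactly 1000 cases falls in the middle group.
    """
    if cases < 200:
        return "under 200"
    if cases <= 1000:
        return "200 to 1000"
    return "over 1000"


def error_by_group(actual, predicted):
    """Compute the percentage error separately for each group."""
    grouped = {}
    for a, p in zip(actual, predicted):
        grouped.setdefault(group_of(a), ([], []))
        grouped[group_of(a)][0].append(a)
        grouped[group_of(a)][1].append(p)
    return {g: mean_absolute_percentage_error(a, p) for g, (a, p) in grouped.items()}


# Made-up toy values (NOT from the study).
actual = [100, 150, 400, 800, 2000, 5000]
predicted = [95, 156, 420, 760, 2100, 4900]
errors = error_by_group(actual, predicted)
```

Reporting the error per group keeps a few large regions from dominating the average, which is one plausible reason for splitting the evaluation this way.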
Using data from different geographical regions within a country and discovering the pattern of prevalence in a region and its neighboring areas, our boosting-based model was able to accurately predict the incidence of COVID-19 within a two-week period.