Goo Taewan, Apio Catherine, Heo Gyujin, Lee Doeun, Lee Jong Hyeok, Lim Jisun, Han Kyulhee, Park Taesung
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea.
Department of Statistics, Seoul National University, Seoul 08826, Korea.
Genomics Inform. 2021 Mar;19(1):e11. doi: 10.5808/gi.21028. Epub 2021 Mar 25.
Predictive modeling of the novel coronavirus disease 2019 (COVID-19) in the literature broadly uses susceptible-exposed-infected-recovered (SEIR)/SIR, agent-based, and curve-fitting models. Governments and legislative bodies rely on insights from prediction models to suggest new policies and to assess the effectiveness of enforced policies. Therefore, access to accurate outbreak prediction models is essential to obtain insights into the likely spread and consequences of infectious diseases. The objective of this study is to predict the future COVID-19 situation in Korea. Here, we employed six models for this analysis: SEIR, local linear regression (LLR), negative binomial (NB) regression, segmented Poisson, a deep learning-based long short-term memory (LSTM) model, and a tree-based gradient boosting machine (GBM). After prediction, model performance was compared using relative mean squared errors (RMSE) for two sets of training data (January 20, 2020 to December 31, 2020 and January 20, 2020 to January 31, 2021) and testing data (January 1, 2021 to February 28, 2021 and February 1, 2021 to February 28, 2021). Except for the segmented Poisson model, the models predicted a decline in daily confirmed cases in the country over the coming period. A comparison of RMSE values showed that LLR, GBM, SEIR, NB, and LSTM, respectively, performed well in forecasting the pandemic situation of the country. A good understanding of the epidemic dynamics would greatly enhance the control and prevention of COVID-19 and other infectious diseases. Therefore, with daily confirmed cases increasing this year, these results could help in the pandemic response by informing decisions about planning, resource allocation, and social distancing policies.
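The abstract does not state the SEIR formulation or the parameter values used in the study. As a minimal sketch only, assuming the textbook SEIR equations integrated with a forward-Euler scheme, the compartment dynamics could be simulated as follows. The function name `simulate_seir`, the population size, and the rates `beta`, `sigma`, and `gamma` are hypothetical placeholders chosen for illustration, not values fitted to Korean data.

```python
from typing import List, Tuple


def simulate_seir(
    population: float,
    beta: float,
    sigma: float,
    gamma: float,
    e0: float,
    i0: float,
    days: int,
    steps_per_day: int = 10,
) -> List[Tuple[float, float, float, float]]:
    """Forward-Euler integration of the textbook SEIR equations.

    dS/dt = -beta * S * I / N
    dE/dt =  beta * S * I / N - sigma * E
    dI/dt =  sigma * E - gamma * I
    dR/dt =  gamma * I

    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s, e, i, r = population - e0 - i0, e0, i0, 0.0
    dt = 1.0 / steps_per_day
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt  # S -> E
            new_infectious = sigma * e * dt               # E -> I
            new_recovered = gamma * i * dt                # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory


# Illustrative run with assumed (not fitted) parameters.
trajectory = simulate_seir(
    population=1_000_000,  # hypothetical population size
    beta=0.3,              # assumed transmission rate per day
    sigma=0.2,             # assumed 1 / incubation period
    gamma=1 / 7,           # assumed 1 / infectious period
    e0=10,
    i0=5,
    days=60,
)
final_s, final_e, final_i, final_r = trajectory[-1]
print(f"Day 60: S={final_s:.0f} E={final_e:.0f} I={final_i:.0f} R={final_r:.0f}")
```

In a forecasting setting such as the one described, the rates would typically be estimated from the training window and the simulated infections compared against the held-out test window; that fitting step is omitted here.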