Interdisciplinary Program of Bioinformatics, Seoul National University, Seoul, Republic of Korea.
Department of Industrial Engineering, Seoul National University, Seoul, Republic of Korea.
Sci Rep. 2024 Apr 30;14(1):9962. doi: 10.1038/s41598-024-58835-9.
The COVID-19 pandemic, caused by the novel SARS-CoV-2 virus, poses a great risk to the world. During the pandemic, observing and forecasting several key indicators of the epidemic (such as daily new confirmed cases, new intensive care unit admissions, and new deaths) helped in preparing an appropriate response (e.g., creating additional intensive care unit beds and implementing strict interventions). Various prediction models and predictor variables have been used to forecast these indicators. However, the impact of prediction models and predictor variables on forecasting performance has not been systematically analyzed. Here, we used a linear mixed model to compare forecasting performance across prediction models (mathematical, statistical, and AI/machine learning models) and predictor variables (vaccination rate, stringency index, and Omicron variant rate) for the seven selected countries with the highest vaccination rates. We chose our best models based on the Bayesian Information Criterion (BIC) and analyzed the significance of each predictor. Simple models were preferred. The selection of the best prediction models and the use of the Omicron variant rate were considered essential for improving prediction accuracy. For the test period before the emergence of the Omicron variant, the selection of the best models was the most significant factor in improving prediction accuracy. For the test period after Omicron emergence, the use of the Omicron variant rate was considered essential in determining forecasting accuracy. Among the prediction models, ARIMA, LightGBM, and TSGLM generally performed well in both test periods. Linear mixed models with country as a random effect showed that the choice of prediction model and the use of Omicron data were significant in determining forecasting accuracy for the highly vaccinated countries.
Relatively simple models, fit with either the prediction model or Omicron data, produced the best forecasting accuracy on the test data.
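The BIC-based selection described above can be illustrated with a minimal sketch. It uses the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood; the lowest BIC is preferred. The candidate names, log-likelihoods, and parameter counts below are hypothetical placeholders chosen for illustration, not values from the study, and the helper functions `bic` and `select_by_bic` are likewise assumptions rather than the authors' code.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates: dict, n_obs: int):
    """Return the name of the lowest-BIC candidate and all BIC scores.

    `candidates` maps a model name to (maximized log-likelihood, parameter count).
    """
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical candidate mixed models (illustrative numbers only).
candidates = {
    "prediction_model_only": (-120.0, 5),
    "omicron_use_only": (-123.0, 3),
    "prediction_model_plus_omicron": (-119.5, 8),
}

best, scores = select_by_bic(candidates, n_obs=100)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: BIC = {score:.2f}")
print("selected:", best)
```

With these made-up inputs, the candidate with the fewest parameters has the lowest BIC even though its log-likelihood is slightly worse, which shows how the ln(n) penalty favors simpler models.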