Naumov A V, Moloshnikov I A, Serenko A V, Sboev A G, Rybka R B
NRC "Kurchatov Institute", Academician Kurchatov sq., 1, Moscow, 123098, Russia.
MEPhI National Research Nuclear University, Kashirskoye sh., 31, Moscow, 115409, Russia.
Procedia Comput Sci. 2021;193:276-284. doi: 10.1016/j.procs.2021.10.028. Epub 2021 Nov 19.
The large amount of data accumulated so far on the dynamics of the COVID-19 outbreak makes it possible to assess the accuracy of forecasting methods in retrospect. This work compares several basic time series analysis methods, including machine learning methods, for forecasting the number of confirmed cases several days ahead. Year-long data for all regions of Russia, taken from the Yandex DataLens platform, were used. As a result, accuracy estimates for these basic methods were obtained for individual Russian regions and for Russia as a whole, as a function of the forecasting horizon. The best basic models for a 14-day forecast are exponential smoothing and the autoregressive integrated moving average (ARIMA) model, with a mean absolute percentage error (MAPE) of 11-19% for the latest part of the course of the epidemic. The accuracies obtained can serve as baselines for more complex prospective models.
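To make the evaluation setup concrete, here is a minimal sketch of how a 14-day forecast could be scored with MAPE. It is an illustration only, not the authors' implementation: the function names `mape` and `ses_forecast`, the smoothing factor `alpha`, and the synthetic case counts are all assumptions, and simple exponential smoothing with a flat forecast stands in for whichever exponential smoothing variant the study used.

```python
from typing import List, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def ses_forecast(history: Sequence[float], horizon: int, alpha: float = 0.5) -> List[float]:
    """Simple exponential smoothing (hypothetical helper).

    The forecast for every step of the horizon is the last smoothed level,
    so the predicted curve is flat.
    """
    if len(history) == 0:
        raise ValueError("history must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = float(history[0])
    for y in history[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return [level] * horizon


# Synthetic daily case counts (made up for illustration, not study data).
cases = [1000.0 + 15.0 * day for day in range(60)]

# Hold out the last 14 days, mirroring the 14-day horizon in the abstract.
horizon = 14
train, test = cases[:-horizon], cases[-horizon:]

forecast = ses_forecast(train, horizon, alpha=0.8)
print(f"MAPE over {horizon} days: {mape(test, forecast):.1f}%")
```

Repeating the hold-out at several cut-off dates and for several values of `horizon` would give error as a function of forecasting horizon, which is the kind of estimate the abstract describes.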