Jain Somit, Agrawal Shobhit, Mohapatra Eshaan, Srinivasan Kathiravan
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India.
Health Care Sci. 2024 Dec 15;3(6):409-425. doi: 10.1002/hcs2.123. eCollection 2024 Dec.
The highly contagious COVID-19 virus has created unprecedented challenges, significantly affecting public health and economies worldwide. This research article conducts a time series analysis of COVID-19 data from several countries, including India, Brazil, Russia, and the United States, with a particular emphasis on total confirmed cases.
The proposed approach combines the ability of the auto-regressive integrated moving average (ARIMA) model to capture linear trends and seasonality with long short-term memory (LSTM) networks, which are designed to learn complex nonlinear dependencies in the data. This hybrid approach surpasses both the individual models and existing ARIMA-artificial neural network (ANN) hybrids, which often struggle with highly nonlinear time series such as COVID-19 data. By integrating ARIMA and LSTM, the model aims to achieve higher forecasting accuracy than the baseline models, which include ARIMA, Gated Recurrent Unit (GRU), LSTM, and Prophet.
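The abstract does not spell out how the two components are coupled. One common way to build such a hybrid, given here only as an illustrative assumption and not as the authors' confirmed formulation, is a residual decomposition: ARIMA models the linear part of the series and the LSTM models what ARIMA leaves unexplained. The symbols $L_t$, $N_t$, $e_t$, $f_{\mathrm{LSTM}}$, and the window length $p$ are introduced for this sketch.

```latex
\begin{align}
  y_t &= L_t + N_t
      && \text{series = linear part + nonlinear part} \\
  e_t &= y_t - \hat{L}_t
      && \text{residual after the ARIMA forecast } \hat{L}_t \\
  \hat{N}_t &= f_{\mathrm{LSTM}}\left(e_{t-1}, e_{t-2}, \dots, e_{t-p}\right)
      && \text{LSTM forecast of the residual from the last } p \text{ residuals} \\
  \hat{y}_t &= \hat{L}_t + \hat{N}_t
      && \text{combined hybrid forecast}
\end{align}
```

Under this assumed scheme, the hybrid can improve on ARIMA alone only to the extent that the residuals $e_t$ still contain learnable structure.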
The hybrid ARIMA-LSTM model outperformed the benchmark models, achieving a mean absolute percentage error (MAPE) score of 2.4%. Among the benchmark models, GRU performed the best with a MAPE score of 2.9%, followed by LSTM with a score of 3.6%.
The proposed ARIMA-LSTM hybrid model outperforms ARIMA, GRU, LSTM, Prophet, and the ARIMA-ANN hybrid model when evaluated using metrics such as MAPE, symmetric mean absolute percentage error, and median absolute percentage error across all countries analyzed. These findings have the potential to significantly improve preparedness and response efforts by public health authorities, allowing for more efficient resource allocation and targeted interventions.
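For readers unfamiliar with the three error metrics named above, the following is a minimal Python sketch of their usual textbook definitions. The function names are chosen for illustration, and the sMAPE variant shown (denominator equal to the mean of the absolute actual and forecast values) is an assumption, since several variants exist and the abstract does not say which one was used.

```python
from statistics import mean, median


def absolute_percentage_errors(actual, forecast):
    """Per-point absolute percentage errors, |a - f| / |a| * 100."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    # Undefined when an actual value is zero; cumulative case counts
    # are normally positive, so that case is not handled here.
    return [abs(a - f) / abs(a) * 100.0 for a, f in zip(actual, forecast)]


def mape(actual, forecast):
    """Mean absolute percentage error."""
    return mean(absolute_percentage_errors(actual, forecast))


def mdape(actual, forecast):
    """Median absolute percentage error (less sensitive to outliers)."""
    return median(absolute_percentage_errors(actual, forecast))


def smape(actual, forecast):
    """Symmetric MAPE, assumed variant: |a - f| / ((|a| + |f|) / 2) * 100."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return mean(
        [2.0 * abs(a - f) / (abs(a) + abs(f)) * 100.0
         for a, f in zip(actual, forecast)]
    )


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    actual = [100.0, 200.0, 400.0]
    forecast = [110.0, 190.0, 400.0]
    print(mape(actual, forecast), mdape(actual, forecast), smape(actual, forecast))
```

All three metrics are scale-free, which is why they can be compared across countries whose case counts differ by orders of magnitude.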