Alharbi Nader
King Saud bin Abdulaziz University for Health Sciences, King Abdullah International Medical Research Center, Riyadh, Saudi Arabia.
JMIRx Med. 2021 Mar 31;2(1):e21044. doi: 10.2196/21044. eCollection 2021 Jan-Mar.
Infectious diseases are among the main threats to human health worldwide. The 2019 outbreak of the new coronavirus SARS-CoV-2, which causes the disease COVID-19, has become a serious global pandemic. Many attempts have been made to forecast the spread of the disease using various methods, including time series models. To the best of our knowledge, none of these attempts has used the singular spectrum analysis (SSA) technique to forecast confirmed cases.
The primary objective of this paper is to construct a reliable, robust, and interpretable model for describing, decomposing, and forecasting the number of confirmed cases of COVID-19 and predicting the peak of the pandemic in Saudi Arabia.
A modified singular spectrum analysis (SSA) approach was applied for the analysis of the COVID-19 pandemic in Saudi Arabia. We proposed this approach and developed it in our previous studies regarding the separability and grouping steps in SSA, which play important roles in reconstruction and forecasting. The modified SSA approach mainly enables us to identify the number of interpretable components required for separability, signal extraction, and noise reduction. The approach was examined using different levels of simulated and real data with different structures and signal-to-noise ratios. In this study, we examined the capability of the approach to analyze COVID-19 data. We then used vector SSA to predict new data points and the peak of the pandemic in Saudi Arabia.
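The decomposition and reconstruction steps mentioned above can be illustrated with a minimal sketch of basic SSA. This is not the modified approach of the study: the function name `ssa_reconstruct`, the window length, and the synthetic series are assumptions made only for illustration, and the number of retained eigenvalues is simply passed in rather than selected by the authors' criterion.

```python
import numpy as np

def ssa_reconstruct(series, window, n_components):
    """Basic SSA: embed, decompose with SVD, keep leading components, average.

    Hypothetical helper for illustration; not the authors' modified SSA.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Embedding: each column of the trajectory matrix is a lagged window.
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    # Decomposition: singular values are sorted in decreasing order.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Grouping: keep the leading components as the signal subspace.
    approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    # Diagonal averaging: average entries that map to the same time index.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        for j in range(k):
            recon[i + j] += approx[i, j]
            counts[i + j] += 1
    return recon / counts

# Synthetic example (not COVID-19 data): a noisy exponential trend.
rng = np.random.default_rng(0)
t = np.arange(42)
noisy = 2.0 * 1.1 ** t + rng.normal(scale=1.0, size=t.size)
signal = ssa_reconstruct(noisy, window=10, n_components=2)
```

The returned array has the same length as the input and can be read as the extracted signal, with the discarded components treated as noise.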
In the first stage, the confirmed daily cases for the first 42 days (March 2 to April 12, 2020) were analyzed to identify the number of eigenvalues required to separate signal from noise. After this number was found to be 2 and the signals were extracted, vector SSA was used to predict and determine the pandemic peak. In the second stage, we updated the data to include 81 daily case values. We used the same window length and number of eigenvalues for reconstruction and for forecasting 90 days ahead. The results of both forecasting scenarios indicated that the peak would occur around the end of May or June 2020 and that the crisis would end between the end of June and the middle of August 2020, with a total number of infected people of approximately 330,000.
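To make the forecasting step concrete, the sketch below extends a series with recurrent SSA forecasting. This is a simpler relative of the vector SSA forecasting used in the study, substituted here for brevity. The function name `ssa_recurrent_forecast`, the window length, and the synthetic series are assumptions, and no COVID-19 data are reproduced.

```python
import numpy as np

def ssa_recurrent_forecast(series, window, n_components, steps):
    """Forecast `steps` points ahead with recurrent SSA (illustrative only)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    ur = u[:, :n_components]
    approx = (ur * s[:n_components]) @ vt[:n_components]
    # Diagonal averaging: anti-diagonals of `approx` are diagonals once flipped.
    flipped = approx[::-1]
    recon = np.array([flipped.diagonal(d - window + 1).mean() for d in range(n)])
    # Linear recurrence coefficients derived from the retained eigenvectors.
    last_row = ur[-1, :]
    nu2 = float(np.sum(last_row ** 2))
    if nu2 >= 1.0:
        raise ValueError("verticality coefficient must be below 1")
    coeffs = (ur[:-1, :] @ last_row) / (1.0 - nu2)
    # Apply the recurrence to the most recent window - 1 values, step by step.
    values = list(recon)
    for _ in range(steps):
        values.append(float(np.dot(coeffs, values[-(window - 1):])))
    return np.array(values[n:])

# Synthetic example: continue a noise-free exponential trend by 5 points.
history = 2.0 * 1.1 ** np.arange(30)
forecast = ssa_recurrent_forecast(history, window=10, n_components=1, steps=5)
```

In a setting like the one described, the number of components would be 2 and the horizon 90 days, with the window length held fixed between the two stages; the window length itself is not given here, so the value above is only a placeholder.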
Our results confirm the strong performance of modified SSA in analyzing COVID-19 data: it selected the number of eigenvalues needed to identify the signal subspace in a noisy time series, which then allowed reliable prediction of daily confirmed cases using the vector SSA method.