Bani-Yaghoub Majid, Rekab Kamel, Pluta Julia, Tabharit Said
Division of Computing, Analytics and Mathematics, School of Science and Engineering, University of Missouri-Kansas City, Kansas City, MO 64110, USA.
Mathematics (Basel). 2025;13(2). doi: 10.3390/math13020180.
Spatial, temporal, and space-time scan statistics can be used for geographical surveillance, identifying temporal and spatial patterns, and detecting outliers. While statistical cluster analysis is a valuable tool for identifying patterns, optimizing resource allocation, and supporting decision-making, accurately predicting future spatial clusters remains a significant challenge. Given the known relative risks of spatial clusters over the past k time intervals, the main objective of the present study is to predict the relative risks for the subsequent interval, k + 1. Building on our prior research, we propose a predictive Markov chain model with an embedded corrector component. This corrector uses either multiple linear regression or exponential smoothing, selecting whichever method minimizes the relative distance between observed and predicted values in the k-th interval. To test the proposed method, we first calculated the relative risks of statistically significant spatial clusters of COVID-19 mortality in the U.S. over seven time intervals from May 2020 to March 2023. Then, for each time interval, we selected the 25 clusters with the highest relative risks and iteratively predicted the relative risks of clusters for intervals three to seven. The predictive accuracies ranged from moderate to high, indicating the potential applicability of this method for predictive disease analytics and future pandemic preparedness.
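The corrector's selection rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Markov chain component is omitted, a single-variable linear trend stands in for multiple linear regression, the relative distance is assumed to be |observed - predicted| / |observed|, and the function names, the smoothing weight `alpha`, and the example values are all hypothetical.

```python
def exp_smoothing_forecast(series, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing.

    alpha is an assumed smoothing weight, not a value from the study.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def linear_trend_forecast(series):
    """One-step-ahead forecast from a least-squares line through (t, value).

    Simplified stand-in for the multiple linear regression in the abstract.
    """
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx if sxx else 0.0
    # Evaluate the fitted line at the next time index, t = n.
    return y_mean + slope * (n - t_mean)


def relative_distance(observed, predicted):
    """Assumed definition: absolute error scaled by the observed value."""
    return abs(observed - predicted) / abs(observed)


def corrected_forecast(series):
    """Choose the forecaster with the smaller relative distance at interval k,
    then apply it to all k values to predict interval k + 1."""
    history, observed_k = series[:-1], series[-1]
    candidates = {
        "exp_smoothing": exp_smoothing_forecast,
        "linear_trend": linear_trend_forecast,
    }
    errors = {
        name: relative_distance(observed_k, forecast(history))
        for name, forecast in candidates.items()
    }
    best = min(errors, key=errors.get)
    return best, candidates[best](series)


# Hypothetical relative risks of one cluster over k = 3 intervals.
method, predicted_next = corrected_forecast([1.0, 2.0, 3.0])
```

In this sketch the selection is made per cluster and per step, so repeating the call as each new interval is observed mirrors the iterative prediction over intervals three to seven described above.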