Shang Aaron C, Galow Kristen E, Galow Gary G
University of Oxford Medical Sciences Division; Oxford OX3 9DU, UK.
Hackensack Meridian School of Medicine; Nutley, NJ 07110, USA.
AIMS Public Health. 2021 Feb 1;8(1):124-136. doi: 10.3934/publichealth.2021010. eCollection 2021.
The COVID-19 pandemic (caused by SARS-CoV-2) has introduced significant challenges for accurate prediction of population morbidity and mortality by traditional variable-based methods of estimation. Challenges to modelling include an inadequate understanding of viral physiology and definitions of positivity that fluctuate between national and international data sets. This paper proposes that accurate forecasting of COVID-19 caseload may be best performed non-parametrically, by vector autoregression (VAR) of verifiable regional data.
A non-linear VAR model across 7 major, demographically representative counties of the New York City (NYC) metropolitan region was constructed using verifiable daily COVID-19 caseload data from March 12 to July 23, 2020. By associating observed case trends with a series of county-specific, data-driven dynamic interdependencies (lagged values), the model produced a systematically non-assumptive VAR approximation of COVID-19 patterns to date and of prospective upcoming trends.
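To illustrate how lagged values from several counties can feed a joint forecast, the following is a minimal sketch of a plain linear VAR fitted by ordinary least squares. It is not the authors' modified, non-linear model: the function names `fit_var` and `forecast`, the two-series synthetic data, and the coefficient values are all assumptions made for illustration, and no NYC caseload data are used.

```python
import numpy as np

def fit_var(series, lags):
    """Least-squares fit of a linear VAR(lags) model (illustrative only).

    series: array of shape (T, k), one column per region.
    Returns (intercept of shape (k,), coefs of shape (lags, k, k)), where
    y_t is approximated by intercept + sum_j coefs[j-1] @ y_{t-j}.
    """
    T, k = series.shape
    # Each design row is [1, y_{t-1}, ..., y_{t-lags}] for t = lags..T-1.
    X = np.asarray([
        np.concatenate([[1.0]] + [series[t - j] for j in range(1, lags + 1)])
        for t in range(lags, T)
    ])
    Y = series[lags:]
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    intercept = beta[0]
    # Rows of beta[1:] are grouped by lag, then predictor; columns are targets.
    coefs = beta[1:].reshape(lags, k, k).transpose(0, 2, 1)
    return intercept, coefs

def forecast(series, intercept, coefs, steps):
    """Iterate the fitted model forward, feeding forecasts back in as lags."""
    history = [row for row in series[-len(coefs):]]
    out = []
    for _ in range(steps):
        nxt = intercept + sum(A @ history[-j] for j, A in enumerate(coefs, start=1))
        out.append(nxt)
        history.append(nxt)
    return np.asarray(out)

# Synthetic two-region example (hypothetical values, not real case counts).
rng = np.random.default_rng(0)
true_A = np.array([[0.5, 0.2], [0.1, 0.6]])
true_c = np.array([5.0, 3.0])
y = np.zeros((2000, 2))
for t in range(1, 2000):
    y[t] = true_c + true_A @ y[t - 1] + rng.normal(size=2)

intercept, coefs = fit_var(y, lags=1)
print(coefs[0].round(2))                          # estimated lag-1 matrix
print(forecast(y, intercept, coefs, steps=3))     # three steps ahead
```

The off-diagonal entries of each lag matrix are what carry the cross-county interdependence: they let one region's recent caseload inform another region's forecast.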
Modified VAR regression of NYC-area COVID-19 caseload trends showed a highly significant capacity to model observed patterns in longitudinal disease incidence (county R range: 0.9221-0.9751, all p < 0.001). Predictively, VAR regression of daily caseload at the county level demonstrated considerable short-term forecasting fidelity (p < 0.001 at one step ahead), with concurrent capacity for longer-term (tested 11-week period) inference of consistent, reasonable upcoming patterns from the latest (model data update) disease epidemiology.
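One way a per-county R of this kind could be obtained is by correlating fitted values with observed values for each series. The sketch below assumes a plain linear VAR fitted by least squares and defines R as the Pearson correlation between in-sample fitted and observed values; the abstract does not state how R was computed, so both choices, the function name `in_sample_r`, and the synthetic data are assumptions.

```python
import numpy as np

def in_sample_r(series, lags):
    """Per-column Pearson correlation between fitted and observed values.

    series: array of shape (T, k), one column per region.
    Fits a linear VAR(lags) by least squares (an illustrative stand-in).
    """
    T, k = series.shape
    # Each design row is [1, y_{t-1}, ..., y_{t-lags}] for t = lags..T-1.
    X = np.asarray([
        np.concatenate([[1.0]] + [series[t - j] for j in range(1, lags + 1)])
        for t in range(lags, T)
    ])
    Y = series[lags:]
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ beta
    return np.array([np.corrcoef(fitted[:, i], Y[:, i])[0, 1] for i in range(k)])

# Synthetic three-region series (hypothetical, not real case counts).
rng = np.random.default_rng(1)
demo = np.cumsum(rng.normal(size=(150, 3)), axis=0)
print(in_sample_r(demo, lags=2))
```

An in-sample correlation like this measures how well the model reproduces the data it was fitted on; out-of-sample forecasting fidelity would need to be checked separately on held-out days.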
In contrast to macroscopic variable-assumption projections, regionally founded VAR modelling may substantially improve projection of short-term community disease burden, reduce the potential for biostatistical error, and better model the epidemiological effects of intervention. Predictive VAR extrapolation of existing public health data at an interdependent regional scale may improve the accuracy of current pandemic burden prognoses.