AbbVie Inc., North Chicago, IL, United States.
Sunnybrook Health Sciences Center, Toronto, ON, Canada.
Front Public Health. 2023 Dec 7;11:1232531. doi: 10.3389/fpubh.2023.1232531. eCollection 2023.
The COVID-19 pandemic has caused over 6 million deaths worldwide. Mortality dynamics vary significantly by country due to pathogen, host, social, and environmental factors, as well as vaccination and treatments. However, there are limited data on the relative contribution of the different explanatory variables that may account for changes in mortality over time. We therefore created a predictive model using orthogonal machine learning techniques in an attempt to quantify the contribution of static and dynamic variables over time.
A model was created using Partial Least Squares Regression trained on data from 2020 to rank order the significance and effect size of static variables on mortality per country. This model enables the prediction of mortality levels for countries based on demographics alone. Partial Least Squares Regression was then used to quantify how dynamic variables, including weather and non-pharmaceutical interventions, contributed to the overall mortality in 2020. Finally, mortality levels for the first 60 days of 2021 were predicted using rolling-window Elastic Net regression.
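The variable-ranking step can be illustrated with a minimal sketch. It assumes a single response (mortality) and ranks predictors by Variable Importance in Projection (VIP) scores from a NIPALS-style PLS1 fit. The abstract does not state which ranking statistic was used, so VIP, the function name `pls1_vip`, and the input layout (one row per country, one column per static variable) are assumptions made for illustration only.

```python
import math

def pls1_vip(X, y, n_components=2):
    """Return one VIP score per predictor from a single-response PLS fit.

    X is a list of rows (observations), y a list of responses.
    Illustrative only: VIP is an assumed ranking statistic.
    """
    n, p = len(X), len(X[0])
    # Center and scale each predictor; center the response.
    means = [sum(r[j] for r in X) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in X) / (n - 1)) or 1.0
           for j in range(p)]
    E = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the response residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        # Scores, X loadings, and the response loading for this component.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so the next component explains what remains.
        for i in range(n):
            for j in range(p):
                E[i][j] -= t[i] * load[j]
            f[i] -= q * t[i]
        weights.append(w)
        explained.append(q * q * tt)  # response variation explained by this component

    total = sum(explained)
    if total == 0.0:
        return [0.0] * p
    return [math.sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
            for j in range(p)]
```

Sorting the predictors by the returned scores in descending order gives a rank order. By construction the squared VIP scores sum to the number of predictors, so a score above 1 is conventionally read as above-average importance.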
This model allowed prediction of deaths per day and quantification of the degree of influence of the included variables, accounting for the timing of their occurrence or implementation. We found that the most parsimonious model could be reduced to six variables: three policy-related variables (COVID-19 testing policy, canceled public events policy, and workplace closing policy) and three environmental variables (maximum temperature per day, minimum temperature per day, and dewpoint temperature per day).
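To make the rolling-window Elastic Net idea concrete, the sketch below refits a penalized linear model on the most recent `window` days and predicts the following day. It is a plain coordinate-descent implementation under assumed settings: the penalty parameters `alpha` and `l1_ratio`, the window length, and the function names are hypothetical and are not taken from the study. The L1 part of the penalty can shrink coefficients exactly to zero, which is one way a model may reduce to a small set of variables.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + alpha*l1_ratio*|b|_1
    + 0.5*alpha*(1 - l1_ratio)*||b||^2 by coordinate descent."""
    n, p = len(X), len(X[0])
    # Center the data so the intercept can be recovered afterwards.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [yi - y_mean for yi in y]  # residual when all coefficients are zero
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [Xc[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the partial residual (excluding j).
            rho = sum(col[i] * (resid[i] + col[i] * coef[j]) for i in range(n)) / n
            new = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                coef[j] = new
    intercept = y_mean - sum(coef[j] * x_mean[j] for j in range(p))
    return coef, intercept

def rolling_forecast(X, y, window, **kwargs):
    """One-step-ahead predictions: fit on the previous `window` rows, predict the next."""
    preds = []
    for t in range(window, len(y)):
        coef, intercept = fit_elastic_net(X[t - window:t], y[t - window:t], **kwargs)
        preds.append(intercept + sum(c * x for c, x in zip(coef, X[t])))
    return preds
```

Here each row of `X` would hold one day's dynamic variables (for example policy indices and temperatures) and `y` the corresponding daily deaths. Inspecting which coefficients remain nonzero across windows gives a rough view of which variables the penalized fit retains.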
Country- and population-level static and dynamic variables can be used to predict COVID-19 mortality. This provides an example of how broad temporal data can inform preparation and mitigation strategies for both COVID-19 and future pandemics. It can also assist decision-makers by identifying the population-level contributors, including interventions, that have the greatest influence on mitigating mortality and optimizing the health and safety of populations.