A random forest model for forecasting regional COVID-19 cases utilizing reproduction number estimates and demographic data.

Authors

Galasso Joseph, Cao Duy M, Hochberg Robert

Affiliations

Department of Biology #11, University of Dallas, Irving, TX 75062, USA.

Department of Computer Science #134, University of Dallas, Irving, TX 75062, USA.

Publication information

Chaos Solitons Fractals. 2022 Mar;156:111779. doi: 10.1016/j.chaos.2021.111779. Epub 2022 Jan 5.

Abstract

During the COVID-19 pandemic, predicting case spikes at the local level is important for a precise, targeted public health response and is generally done with compartmental models. The performance of compartmental models is highly dependent on the accuracy of their assumptions about disease dynamics within a population; thus, such models are susceptible to human error, unexpected events, or unknown characteristics of a novel infectious agent like COVID-19. We present a relatively non-parametric random forest model that forecasts the number of COVID-19 cases at the U.S. county level. Its most prioritized training features are derived from easily accessible, standard epidemiological data (i.e., regional test positivity rate) and the effective reproduction number (R_t) from compartmental models. A novel input training feature is case projections generated by aligning the estimated effective reproduction number (pre-computed by COVIDActNow.org) with real-time testing data until the two are maximally correlated, which helps our model fit better to the epidemic's trajectory as ascertained by traditional models. The poor reliability of R_t is partially mitigated by adding dynamic population mobility and the prevalence and mortality of non-COVID-19 diseases as gauges of population disease susceptibility. The model was used to generate forecasts for 1, 2, 3, and 4 weeks into the future for each reference week from November 1, 2020 to January 10, 2021 for 3068 counties. Over this time period, it maintained a mean absolute error (MAE) of less than 300 weekly cases per 100,000 and consistently outperformed or performed comparably with gold-standard compartmental models. Furthermore, it holds great promise for ensemble modeling because it can accommodate a more expansive training feature set while maintaining good performance and limited resource utilization.
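The alignment step and the error metric named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which correlation measure or lag search was used, so Pearson correlation over whole-step lags is an assumption here, and the function names (`pearson`, `best_lag`, `mean_absolute_error`) are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(rt, positivity, max_lag):
    """Find the shift (in time steps) at which rt[t] best matches positivity[t + lag].

    Assumption: both series are sampled at the same interval and have equal length.
    Returns (lag, correlation) for the lag with the highest Pearson correlation.
    """
    if len(rt) != len(positivity):
        raise ValueError("series must have equal length")
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        n = len(rt) - lag          # number of overlapping points at this lag
        if n < 3:                  # too few points for a meaningful correlation
            break
        r = pearson(rt[:n], positivity[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


def mean_absolute_error(actual, predicted):
    """Mean absolute error, e.g. in weekly cases per 100,000."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For example, `best_lag(rt_series, positivity_series, max_lag=4)` would return the lag at which the two series line up best; the shifted R_t series could then be supplied as one input feature to a random forest regressor, and `mean_absolute_error` applied to the resulting county-level forecasts.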

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a962/8731233/ee65ccecde04/gr1_lrg.jpg
