

Predicting COVID-19 county-level case number trend by combining demographic characteristics and social distancing policies.

Authors

Li Megan Mun, Pham Anh, Kuo Tsung-Ting

Affiliations

Department of Biology, University of California San Diego, La Jolla, California, USA.

UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA.

Publication information

JAMIA Open. 2022 Jun 25;5(3):ooac056. doi: 10.1093/jamiaopen/ooac056. eCollection 2022 Oct.

Abstract

OBJECTIVE

Predicting daily trends in Coronavirus Disease 2019 (COVID-19) case numbers is important for supporting individual decisions about taking preventive measures. This study aims to use COVID-19 case number history, demographic characteristics, and social distancing policies, both independently and interdependently, to predict the daily trend (rise or fall) in county-level case numbers.
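As an illustration of the prediction target, the sketch below turns a county's cumulative case counts into a binary rise/fall label per day. The abstract does not state the exact labeling rule, so the rule used here (a day is a "rise" when its new cases exceed the previous day's) and the function name `daily_trend_labels` are assumptions made for the example, not the authors' definition.

```python
from typing import List


def daily_trend_labels(cumulative_cases: List[int]) -> List[int]:
    """Return 1 (rise) or 0 (fall or no change) for each day after the first two.

    Assumed rule, not taken from the paper: a day counts as a rise when its
    number of new cases is larger than the previous day's number of new cases.
    """
    # Daily new cases are the day-to-day differences of the cumulative series.
    new_cases = [b - a for a, b in zip(cumulative_cases, cumulative_cases[1:])]
    # Compare each day's new cases with the previous day's new cases.
    return [
        1 if today > yesterday else 0
        for yesterday, today in zip(new_cases, new_cases[1:])
    ]


# Hypothetical cumulative counts for one county over six days.
county_cumulative = [10, 12, 17, 20, 20, 26]
labels = daily_trend_labels(county_cumulative)
```

Other reasonable definitions exist, for example comparing smoothed multi-day averages; the binary framing is what makes AUC a natural evaluation metric.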

MATERIALS AND METHODS

We extracted 2093 features for 3142 US counties: 5 from the US COVID-19 case number history, 1824 from demographic characteristics (independently and interdependently), and 264 from social distancing policies (independently and interdependently). Using the top 200 selected features, we built 4 machine learning models (Logistic Regression, Naïve Bayes, Multi-Layer Perceptron, and Random Forest) and 4 Ensemble methods (Average, Product, Minimum, and Maximum), and compared their performance.
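The four Ensemble methods named above correspond to common rules for combining the probability outputs of several classifiers. The sketch below shows one plausible reading of those rules in plain Python. The abstract does not describe how the combination was implemented, so the function `combine`, the dictionary keys, and the probability values are all hypothetical.

```python
import math
from typing import List


def combine(probabilities: List[float], rule: str) -> float:
    """Combine per-model probabilities of a 'rise' into one ensemble score."""
    if not probabilities:
        raise ValueError("need at least one model probability")
    if rule == "average":
        return sum(probabilities) / len(probabilities)
    if rule == "product":
        return math.prod(probabilities)
    if rule == "minimum":
        return min(probabilities)
    if rule == "maximum":
        return max(probabilities)
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical predicted probabilities of a rise for one county-day.
model_outputs = {
    "logistic_regression": 0.6,
    "naive_bayes": 0.8,
    "multi_layer_perceptron": 0.5,
    "random_forest": 0.7,
}
scores = list(model_outputs.values())
ensemble = {
    rule: combine(scores, rule)
    for rule in ("average", "product", "minimum", "maximum")
}
```

The product score is not on the same scale as the individual probabilities, but this does not matter for a ranking-based metric such as AUC, which depends only on the order of the scores.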

RESULTS

The Ensemble Average method achieved the highest area under the receiver operating characteristic curve (AUC), 0.692. The top-ranked features were all interdependent features.
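For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive example (here, a rise) receives a higher score than a randomly chosen negative example (a fall), with ties counted as one half. The sketch below computes it directly from that definition. It is a simple pairwise illustration on made-up data, not the evaluation code used in the study.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = rise, 0 = fall) and ensemble scores.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.4, 0.35, 0.2]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which places the reported 0.692 between the two.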

CONCLUSION

The findings of this study suggest that diverse features, especially when combined, have predictive power for county-level trends in COVID-19 cases and may help individuals make their daily decisions. Our results may guide future studies to consider more features interdependently, drawn from conventionally distinct data sources, in county-level predictive models. Our code is available at: https://doi.org/10.5281/zenodo.6332944.


