Multidimensional Machine Learning Model to Calculate a COVID-19 Vulnerability Index.

Author information

Rosero Perez Paula Andrea, Realpe Gonzalez Juan Sebastián, Salazar-Cabrera Ricardo, Restrepo David, López Diego M, Blobel Bernd

Affiliations

Research Group in Telematics Engineering, Telematics Department, Universidad del Cauca, Popayán 190002, Colombia.

Medical Faculty, University of Regensburg, 93053 Regensburg, Germany.

Publication information

J Pers Med. 2023 Jul 15;13(7):1141. doi: 10.3390/jpm13071141.

Abstract

In Colombia, the first case of COVID-19 was confirmed on 6 March 2020. On 13 March 2023, Colombia registered 6,360,780 confirmed positive cases of COVID-19, representing 12.18% of the total population. The National Administrative Department of Statistics (DANE) in Colombia published in 2020 a COVID-19 vulnerability index, which estimates the vulnerability (per city block) of being infected with COVID-19. Unfortunately, DANE did not consider multiple factors that could increase the risk of COVID-19 (in addition to demographic and health), such as environmental and mobility data (found in the related literature). The proposed multidimensional index considers variables of different types (unemployment rate, gross domestic product, citizens' mobility, vaccination data, and climatological and spatial information) in which the incidence of COVID-19 is calculated and compared with the incidence of the COVID-19 vulnerability index provided by DANE. The collection, data preparation, modeling, and evaluation phases of the Cross-Industry Standard Process for Data Mining methodology (CRISP-DM) were considered for constructing the index. The multidimensional index was evaluated using multiple machine learning models to calculate the incidence of COVID-19 cases in the main cities of Colombia. The results showed that the best-performing model to predict the incidence of COVID-19 in Colombia is the Extra Trees Regressor algorithm, obtaining an R-squared of 0.829. This work is the first step toward a multidimensional analysis of COVID-19 risk factors, which has the potential to support decision making in public health programs. The results are also relevant for calculating vulnerability indexes for other viral diseases, such as dengue.
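
The abstract reports an R-squared of 0.829 for the Extra Trees Regressor. As a reminder of what that metric measures, here is a minimal sketch of the standard coefficient of determination in plain Python. The function name `r_squared` and the sample values are illustrative assumptions, not the authors' code or data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the model's predictions
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant observations")
    return 1.0 - ss_res / ss_tot


# Hypothetical incidence values, made up purely for illustration
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
score = r_squared(observed, predicted)
print(round(score, 3))
```

A value of 1.0 means the predictions match the observations exactly, while a value near 0 means the model does no better than always predicting the mean incidence.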


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f7f/10381838/3cf1734f6943/jpm-13-01141-g0A1.jpg
