Fine-Scale Spatial Prediction on the Risk of Plasmodium vivax Infection in the Republic of Korea.

Affiliations

College of Veterinary Medicine, Chungbuk National University, Cheongju, Korea.

Division of Infectious Diseases, Department of Internal Medicine, College of Medicine, Soonchunhyang University, Asan, Korea.

Publication information

J Korean Med Sci. 2024 Jun 10;39(22):e176. doi: 10.3346/jkms.2024.39.e176.

DOI:10.3346/jkms.2024.39.e176

PMID:38859739

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11164649/

Abstract

BACKGROUND

Malaria elimination strategies in the Republic of Korea (ROK) have reduced malaria incidence but remain hampered by delayed case detection and response. To address this, machine learning models that predict malaria occurrence, with a focus on high-risk areas, were developed.

METHODS

The study targeted the northern region of ROK, near the demilitarized zone, using a 1-km grid to identify areas for prediction. Grid cells without residential buildings were excluded, leaving 8,425 cells. The prediction was based on whether at least one malaria case was reported in each grid cell per month, using spatial data of patient locations. Four algorithms were used: gradient boosted (GBM), generalized linear (GLM), extreme gradient boosted (XGB), and ensemble models, incorporating environmental, sociodemographic, and meteorological data as predictors. The models were trained with data from May to October (2019-2021) and tested with data from May to October 2022. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC).
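The outcome definition and the year-based split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the cell identifiers, the case records, and the helper names `build_labels` and `temporal_split` are all hypothetical, and the real study used 8,425 grid cells rather than three.

```python
from itertools import product

# Hypothetical case records: (grid_cell_id, year, month). Not data from the study.
cases = [
    ("cell_001", 2019, 6),
    ("cell_001", 2019, 6),  # a second case in the same cell-month
    ("cell_002", 2021, 8),
    ("cell_001", 2022, 7),
]

cells = ["cell_001", "cell_002", "cell_003"]  # illustrative residential grid cells
years = [2019, 2020, 2021, 2022]
months = range(5, 11)  # May to October


def build_labels(cases, cells, years, months):
    """Map every (cell, year, month) to 1 if at least one case was reported, else 0."""
    positive = set(cases)  # duplicates collapse: only presence matters
    return {
        key: int(key in positive)
        for key in product(cells, years, months)
    }


def temporal_split(labels, test_year=2022):
    """Train on all years before test_year, test on test_year only."""
    train = {k: v for k, v in labels.items() if k[1] < test_year}
    test = {k: v for k, v in labels.items() if k[1] == test_year}
    return train, test


labels = build_labels(cases, cells, years, months)
train, test = temporal_split(labels)
print(len(train), "training cell-months,", len(test), "test cell-months")
```

Splitting by year rather than at random keeps the test season entirely out of training, which mirrors how such a model would be used to predict an upcoming season.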

RESULTS

All prediction models achieved high AUROC values (GBM = 0.9243, GLM = 0.9060, XGB = 0.9180, and ensemble model = 0.9301). Previous malaria risk, population size, and meteorological factors were the most influential predictors in the GBM and XGB models.
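To make the AUROC figures concrete: the metric equals the probability that a randomly chosen positive cell-month receives a higher predicted risk than a randomly chosen negative one. The sketch below computes it directly from that definition on toy scores. The labels and scores are invented, and averaging two models' scores is only one simple way to form an ensemble; the abstract does not state how the study's ensemble was built.

```python
def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and predicted risks (hypothetical, not study data).
y_true = [0, 0, 1, 0, 1, 0]
model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]
model_b = [0.20, 0.10, 0.60, 0.70, 0.90, 0.30]

# Assumed ensemble rule for illustration: unweighted mean of the two scores.
ensemble = [(a + b) / 2 for a, b in zip(model_a, model_b)]

for name, scores in [("A", model_a), ("B", model_b), ("ensemble", ensemble)]:
    print(name, auroc(y_true, scores))
```

On this toy data each single model ranks 7 of 8 positive-negative pairs correctly (0.875), while the averaged scores rank all 8 correctly (1.0), because the two models misrank different negatives. Real data need not behave this way.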

CONCLUSION

Machine-learning models with properly preprocessed malaria case data can provide reliable predictions. Additional predictors, such as mosquito density, should be included in future studies to improve the performance of models.
