School of Public Health, Lanzhou University, Gansu, China.
Gansu Provincial Center for Disease Control and Prevention, Gansu, China.
Environ Sci Pollut Res Int. 2023 Jan;30(4):9962-9973. doi: 10.1007/s11356-022-22831-1. Epub 2022 Sep 6.
This study examines the cumulative lag effect of meteorological factors on brucellosis incidence and evaluates the prediction performance of a Random Forest (RF) model. Monthly brucellosis case counts and meteorological data from 2015 to 2019 in Yongchang County, Gansu Province, northwest China, were used to build a distributed lag nonlinear model (DLNM). The number of brucellosis cases at a lag of 1 month and meteorological data from 2015 to 2018 were used to build an RF model to predict brucellosis incidence in 2019. A SARIMA model was also established, and its prediction performance was compared with that of the RF model in terms of R and RMSE. The results indicated that the population had a high incidence risk at temperatures between 5 and 13 °C with a lag of 0 to 18 days, at sunshine durations between 225 and 260 h with a lag of 0 to 1 month, and at atmospheric pressures between 789 and 793.5 hPa with a lag of 0 to 18 days. For the RF model, R and RMSE were 0.903 and 1.609 on the training set and 0.824 and 2.657 on the test set; for the SARIMA model, R and RMSE were 0.530 and 7.008. This study found significant nonlinear and lagged associations between meteorological factors and brucellosis incidence. The prediction performance of the RF model was more accurate and practical than that of the SARIMA model.
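The evaluation setup described in the abstract (a lag-1 case count as a predictor, 2015 to 2018 for training, 2019 for testing, and comparison by R and RMSE) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, R is assumed to mean the Pearson correlation between observed and predicted counts (the abstract does not define it), the toy counts are invented placeholders rather than study data, and a simple persistence forecast stands in for the fitted RF or SARIMA model.

```python
import math


def make_lagged_rows(monthly):
    """Attach the previous month's case count (lag 1) to each month.

    `monthly` is a chronologically ordered list of (year, month, cases).
    The first month is dropped because it has no predecessor.
    """
    rows = []
    for prev, cur in zip(monthly, monthly[1:]):
        rows.append(
            {"year": cur[0], "month": cur[1], "lag1_cases": prev[2], "cases": cur[2]}
        )
    return rows


def split_by_year(rows, test_year):
    """Chronological split: years before `test_year` train, `test_year` tests."""
    train = [r for r in rows if r["year"] < test_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def pearson_r(observed, predicted):
    """Pearson correlation (assumed here to be the abstract's 'R')."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted counts."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Invented placeholder counts, NOT data from the study.
    toy = [(2018, 11, 4), (2018, 12, 6), (2019, 1, 5), (2019, 2, 9), (2019, 3, 12)]
    rows = make_lagged_rows(toy)
    train, test = split_by_year(rows, test_year=2019)
    observed = [r["cases"] for r in test]
    # Persistence forecast (predict last month's count) as a stand-in model.
    predicted = [r["lag1_cases"] for r in test]
    print("R:", pearson_r(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
```

In a fuller analysis, the `lag1_cases` column and the meteorological variables would be passed to a fitted regressor in place of the persistence forecast. Keeping the split chronological ensures that no 2019 observation leaks into training.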