Tuan Dang Anh, Dang Tran Ngoc
Faculty of Public Health, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City 70000, Vietnam.
Trop Med Infect Dis. 2024 Oct 21;9(10):250. doi: 10.3390/tropicalmed9100250.
Dengue fever is a persistent public health issue in tropical regions, including Vietnam, where climate variability plays a crucial role in disease transmission dynamics. This study develops climate-based machine learning models to forecast dengue outbreaks in Ba Ria Vung Tau (BRVT) province, Vietnam, using meteorological data from 2003 to 2022. We used four models to predict weekly dengue incidence: Negative Binomial Regression (NBR), Seasonal AutoRegressive Integrated Moving Average with Exogenous Regressors (SARIMAX), Extreme Gradient Boosting (XGBoost) v2.0.3, and Long Short-Term Memory (LSTM). Key climate variables, including temperature, humidity, precipitation, and wind speed, were integrated into these models, with lagged variables included to capture delayed climatic effects on dengue transmission. The NBR model demonstrated the best predictive accuracy, achieving the lowest Mean Absolute Error (MAE) of the four models. The inclusion of lagged climate variables significantly enhanced the model's ability to predict dengue cases. Although effective in capturing seasonal trends, the SARIMAX and LSTM models struggled with overfitting and failed to accurately predict short-term outbreaks. XGBoost exhibited moderate predictive power but was prone to overfitting, particularly without fine-tuning. Our findings confirm that climate-based machine learning models, particularly the NBR model, offer valuable tools for forecasting dengue outbreaks in BRVT. However, improving the models' ability to predict short-term peaks remains a challenge. The integration of meteorological data into early warning systems is crucial for public health authorities to plan timely and effective interventions. This research contributes to the growing body of literature on climate-based disease forecasting and underscores the need for further model refinement to address the complexities of dengue transmission in highly endemic regions.
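As a rough illustration of two ideas named in the abstract, lagged climate variables and the Mean Absolute Error, the sketch below uses only the Python standard library. It is not the authors' pipeline: the function names `add_lags` and `mean_absolute_error`, the lag lengths of 1 and 2 weeks, and all numeric values are hypothetical and chosen only for demonstration.

```python
from typing import Dict, List, Optional, Sequence


def add_lags(series: Sequence[float],
             lags: Sequence[int]) -> Dict[int, List[Optional[float]]]:
    """Return {lag: copy of `series` shifted forward by `lag` weeks}.

    The first `lag` entries of each shifted copy are None, because no
    earlier observation exists for those weeks.
    """
    n = len(series)
    shifted: Dict[int, List[Optional[float]]] = {}
    for lag in lags:
        if lag < 1:
            raise ValueError("lag must be a positive number of weeks")
        padding: List[Optional[float]] = [None] * min(lag, n)
        shifted[lag] = padding + list(series[:max(n - lag, 0)])
    return shifted


def mean_absolute_error(observed: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted counts."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(o - p) for o, p in zip(observed, predicted))
    return total / len(observed)


# Hypothetical weekly rainfall in millimetres (not data from the study).
weekly_rain_mm = [12.0, 30.5, 8.0, 0.0, 45.2, 22.1]

# Assumed lags of 1 and 2 weeks; the abstract does not state the lags used.
lagged = add_lags(weekly_rain_mm, lags=[1, 2])

# Hypothetical observed and predicted weekly case counts.
observed_cases = [5, 9, 14, 11]
predicted_cases = [6, 7, 15, 13]
mae = mean_absolute_error(observed_cases, predicted_cases)
```

Here `lagged[2][i]` holds the rainfall recorded two weeks before week `i`, which is the kind of delayed predictor a count model could take as input. A lower `mae` on held-out weeks would indicate closer agreement between predicted and observed case counts.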