Department of Vector Ecology and Environment, Institute of Tropical Medicine (NEKKEN), Nagasaki University, 1-12-4, Sakamoto, Nagasaki City, 852-8523, Japan.
Graduate School of Biomedical Sciences, Nagasaki University, Nagasaki City, Japan.
Sci Rep. 2023 Dec 28;13(1):23091. doi: 10.1038/s41598-023-50176-3.
Climatic factors influence malaria transmission through their effects on the Anopheles vector and the Plasmodium parasite. Modelling and understanding the complex effects of climate on malaria incidence can enable important early warning capabilities. Deep learning is proving valuable across many fields; however, its use in epidemiological forecasting is still in its infancy, and applied deep learning studies of malaria in southern Africa that leverage quality datasets are lacking. Using a novel high-resolution malaria incidence dataset containing 23 years of daily data from 1998 to 2021, a statistical model and an XGBOOST machine learning model were compared with a deep learning Transformer model by assessing the accuracy of their numerical predictions. A novel loss function, designed to account for the variable nature of the data, improved performance by around 20% relative to the standard mean squared error (MSE) loss. When numerical predictions were converted to alert thresholds to mimic use in a real-world setting, the Transformer's AUROC of 80% was 20-40% higher than that of the statistical and XGBOOST models, and it had the highest overall accuracy, 98%. The Transformer performed consistently, with accuracy increasing as more climate variables were used, indicating further potential for this framework to predict malaria incidence at a daily level using climate data for southern Africa.
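As an illustration of the evaluation step described above, the following minimal Python sketch converts numerical predictions into binary alerts and scores them with accuracy and AUROC. It is not the authors' code: the threshold value, the example case counts, and the function names (`to_alerts`, `accuracy`, `auroc`) are all hypothetical, and the abstract does not state how the alert thresholds were actually defined.

```python
from typing import List, Sequence


def to_alerts(values: Sequence[float], threshold: float) -> List[int]:
    """Label each day 1 (alert) if its value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def accuracy(labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of days on which the predicted alert matches the observed alert."""
    return sum(a == b for a, b in zip(labels, predicted)) / len(labels)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random alert day receives a higher score
    than a random non-alert day, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one alert and one non-alert day")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical daily case counts and model predictions (not from the study).
observed = [0, 2, 9, 1, 12, 3]
predicted = [1.0, 6.0, 5.5, 4.5, 10.0, 2.0]
THRESHOLD = 5.0  # assumed alert threshold, chosen only for illustration

true_alerts = to_alerts(observed, THRESHOLD)
predicted_alerts = to_alerts(predicted, THRESHOLD)

print("accuracy:", accuracy(true_alerts, predicted_alerts))
print("AUROC:", auroc(true_alerts, predicted))
```

Accuracy is computed on the thresholded alerts, whereas AUROC uses the raw predicted values as scores. The two metrics can diverge when alert days are rare, which is one reason to report both.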