Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China.
Int J Environ Res Public Health. 2022 Oct 19;19(20):13555. doi: 10.3390/ijerph192013555.
Efficient and accurate dengue risk prediction is an important basis for dengue prevention and control, but it faces practical challenges: multi-source data must be downloaded and processed to generate risk predictors, and training and validating models locally consumes significant time and computational resources. To address these challenges, this study proposes a framework for dengue risk prediction that integrates cloud computing on big geospatial data using the Google Earth Engine (GEE) platform with artificial intelligence modeling on the Google Colab platform. The framework supports defining the epidemiological calendar, delineating the predominant area of dengue transmission in cities, generating risk predictor data, and defining prediction scenarios for multiple lead times. We evaluated the framework in experiments using weekly dengue cases from 2013 to 2020 in the Federal District and Fortaleza, Brazil. Four predictors were considered: total rainfall (R), mean temperature (T), mean relative humidity (RH), and mean normalized difference vegetation index (NDVI). Three models (random forest (RF), long short-term memory (LSTM), and LSTM with an attention mechanism (LSTM-ATT)) and two modeling scenarios (with or without dengue cases as a predictor) were used to produce 1- to 4-week-ahead predictions. A total of 24 models were built. The results showed that, in general, the LSTM and LSTM-ATT models outperformed the RF models; that modeling benefited from including historical dengue cases as a predictor, which made the predicted curves more stable than those obtained from climate and environmental factors alone; and that the attention mechanism could further improve the performance of the LSTM models.
For future dengue risk prediction, this study demonstrates the effectiveness of GEE-based big geospatial data processing for generating risk predictors and of Google Colab-based risk modeling, and it shows the benefits of using historical dengue data as an input feature and of adding an attention mechanism to LSTM models.
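The two modeling scenarios and the 1- to 4-week-ahead setup described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `make_samples`, the dictionary layout of the weekly records, and the `lookback` window length are all assumptions introduced here for illustration, since the abstract does not specify them. The sketch only shows how input windows and targets might be paired for a given lead time, with or without past dengue cases among the input features.

```python
from itertools import product

PREDICTORS = ["R", "T", "RH", "NDVI"]   # the four predictors named in the abstract
MODELS = ["RF", "LSTM", "LSTM-ATT"]     # the three model types named in the abstract
HORIZONS = [1, 2, 3, 4]                 # lead times in weeks


def make_samples(weekly, horizon, use_cases, lookback=4):
    """Pair input windows with targets for a horizon-week-ahead forecast.

    weekly:    list of dicts, one per week, with keys "R", "T", "RH",
               "NDVI" and "cases" (hypothetical layout).
    use_cases: if True, past dengue cases are added to the input features.
    lookback:  number of past weeks per input window (an assumed value,
               not given in the abstract).
    """
    features = PREDICTORS + (["cases"] if use_cases else [])
    X, y = [], []
    # t is the index of the last observed week in each input window.
    for t in range(lookback - 1, len(weekly) - horizon):
        window = weekly[t - lookback + 1 : t + 1]
        X.append([[week[f] for f in features] for week in window])
        # The target is the case count `horizon` weeks after the window ends.
        y.append(weekly[t + horizon]["cases"])
    return X, y


# Every combination of model type, scenario (with/without cases) and lead time.
configs = list(product(MODELS, [True, False], HORIZONS))
```

Enumerating three model types, two scenarios, and four lead times gives 3 × 2 × 4 = 24 combinations, which is consistent with the 24 models reported, although the abstract does not state how the authors arrived at that count. Each `(X, y)` pair would then be passed to whichever model the configuration names; an RF model would typically need each window flattened into a single feature vector first.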