European Space Agency, Climate Office, ECSAT, Harwell OX11 0FD, UK.
Plymouth Marine Laboratory, Prospect Place, The Hoe, Plymouth PL1 3DH, UK.
Int J Environ Res Public Health. 2020 Dec 15;17(24):9378. doi: 10.3390/ijerph17249378.
Oceanic and coastal ecosystems have undergone complex environmental changes in recent years in the context of climate change. These changes are also reflected in the dynamics of water-borne diseases, because some of the causative agents of these illnesses are ubiquitous in the aquatic environment and their survival rates are affected by changes in climatic conditions. Previous studies have established strong relationships between essential climate variables and the coastal distribution and seasonal dynamics of the bacterium Vibrio cholerae, pathogenic types of which are responsible for human cholera disease. In this study we provide a novel exploration of the potential of a machine learning approach to forecast environmental cholera risk in coastal India, home to more than 200 million inhabitants, using atmospheric, terrestrial and oceanic satellite-derived essential climate variables. A random forest classifier model is developed, trained and tested on a cholera outbreak dataset covering the period 2010-2018 for districts along coastal India. The random forest classifier model has an accuracy of 0.99, an F1 score of 0.942 and a sensitivity of 0.895, meaning that 89.5% of outbreaks are correctly identified. The model's performance shows spatio-temporal patterns across seasons and coastal locations. Further analysis of the specific contribution of each essential climate variable to the model outputs shows that chlorophyll-a concentration, sea surface salinity and land surface temperature are the strongest predictors of the cholera outbreaks in the dataset used. The study reveals the promising potential of random forest classifiers and remotely sensed essential climate variables for the development of environmental cholera-risk applications.
Further exploration of the present random forest model and associated essential climate variables is encouraged on cholera surveillance datasets in other coastal areas affected by the disease to determine the model's transferability potential and applicative value for cholera forecasting systems.
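The three scores quoted in the abstract (accuracy, F1 score and sensitivity) are all derived from the same binary confusion matrix, and on a dataset where outbreaks are rare they can diverge noticeably. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The class `ConfusionCounts` and the counts in `example` are hypothetical: they were chosen only so that the resulting scores land near the reported ones, and they are not the study's confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'outbreak' as the positive class."""
    tp: int  # outbreaks correctly predicted as outbreaks
    fn: int  # outbreaks the model missed
    fp: int  # non-outbreaks wrongly flagged as outbreaks
    tn: int  # non-outbreaks correctly predicted as non-outbreaks


def accuracy(c: ConfusionCounts) -> float:
    """Share of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true outbreaks that are detected (also called recall)."""
    return c.tp / (c.tp + c.fn)


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and sensitivity, written in count form."""
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


# HYPOTHETICAL counts for illustration only (not taken from the study):
# 19 outbreak cases among 200 district-level samples.
example = ConfusionCounts(tp=17, fn=2, fp=0, tn=181)

print(f"accuracy    = {accuracy(example):.3f}")
print(f"sensitivity = {sensitivity(example):.3f}")
print(f"F1 score    = {f1_score(example):.3f}")
```

With these illustrative counts, accuracy is 0.99 even though two of the nineteen outbreaks are missed, because the many correctly classified non-outbreak samples dominate the total. This is why sensitivity and F1 are the more informative scores for a rare-event classifier of this kind.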