Rahman M S, Pientong Chamsai, Zafar Sumaira, Ekalaksananan Tipaya, Paul Richard E, Haque Ubydul, Rocklöv Joacim, Overgaard Hans J
Department of Microbiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand.
Department of Statistics, Begum Rokeya University, Rangpur, Rangpur-5404, Bangladesh.
One Health. 2021 Dec 4;13:100358. doi: 10.1016/j.onehlt.2021.100358. eCollection 2021 Dec.
Mapping the spatial distribution of dengue vector mosquitoes and accurately predicting their abundance are crucial for designing effective vector control strategies and early warning tools for dengue epidemic prevention. Socio-ecological and landscape factors influence vector abundance. Therefore, we aimed to map the spatial distribution of adult female dengue vectors and to predict their abundance in northeastern Thailand from socioeconomic factors, knowledge, attitudes and practices (KAP) regarding climate change and dengue, and/or landscape factors, using a machine learning (ML)-based system.
A total of 1066 adult female dengue vector mosquitoes were collected from four villages in northeastern Thailand during January-December 2019. Information on household socioeconomics, KAP regarding climate change and dengue, and satellite-based landscape data was also acquired. Geographic information systems (GIS) were used to map the household-level spatial distribution of adult female vector abundance (high/low). Five popular supervised learning models, logistic regression (LR), support vector machine (SVM), k-nearest neighbor (kNN), artificial neural network (ANN), and random forest (RF), were used to predict adult female vector abundance (high/low). The predictive accuracy of each modeling technique was calculated and compared. The variables most important for predicting adult female vector abundance were identified using the best-fitting model.
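To make the high/low classification step concrete, the following is a minimal sketch of one of the five model families named above, kNN, written in plain Python. It is not the authors' implementation. The two features (a built-up fraction and a prevention-practice score, both scaled to 0-1) and all data values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Rank training households by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical household features: (built-up fraction, practice score).
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30),
           (0.80, 0.90), (0.90, 0.70), (0.85, 0.80)]
# Hypothetical abundance labels for those households.
train_y = ["low", "low", "low", "high", "high", "high"]

pred_a = knn_predict(train_X, train_y, (0.20, 0.20))
pred_b = knn_predict(train_X, train_y, (0.80, 0.80))
```

In practice a held-out or cross-validated split would be needed before comparing models; the abstract does not state which validation scheme was used.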
Urban areas had a higher abundance of adult female dengue vectors than rural areas. Overall, study respondents in both urban and rural areas had inadequate KAP regarding climate change and dengue. The average landscape composition per household in urban areas was rice crop (47.4%), natural tree cover (17.8%), built-up area (13.2%), permanent wetlands (21.2%), and rubber plantation (0%); the corresponding figures for rural areas were 12.1%, 2.0%, 38.7%, 40.1%, and 0.1%, respectively. Among all assessed models, RF showed the best performance in predicting adult female vector abundance (socioeconomics: area under the curve, AUC = 0.93, classification accuracy, CA = 0.86, F1 score = 0.85; KAP: AUC = 0.95, CA = 0.92, F1 = 0.90; landscape: AUC = 0.96, CA = 0.89, F1 = 0.87). Combining all factors further improved the predictive accuracy of the RF model (socioeconomics + KAP + landscape: AUC = 0.99, CA = 0.96, F1 = 0.95). Dengue prevention practices were the most important predictor of adult female vector abundance in the RF model.
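For readers less familiar with the three reported metrics, the sketch below shows how CA, F1, and AUC can be computed for a binary high/low outcome in plain Python. It treats "high" as the positive class, which is an assumption: the abstract does not say how F1 was averaged across classes. The labels and scores are made up for illustration and are not study data.

```python
def classification_accuracy(y_true, y_pred):
    """Fraction of households whose predicted class matches the observed one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive="high"):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores, positive="high"):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up observed labels and model scores (predicted probability of "high").
y_true = ["high", "high", "low", "low", "high", "low"]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
# Threshold the scores at 0.5 to obtain class predictions.
y_pred = ["high" if s >= 0.5 else "low" for s in scores]

ca = classification_accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
area = auc(y_true, scores)
```

CA and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values reported for a single model can differ.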
Of the five models assessed, RF was the most suitable for predicting dengue vector abundance in northeastern Thailand. Our study shows that GIS and machine learning systems have significant potential for understanding the spatial distribution of dengue vectors and predicting their abundance. The findings might help optimize vector control strategies, future mosquito suppression, and the prediction and control of epidemic arboviral diseases (dengue, chikungunya, and Zika). Such strategies can be incorporated into transdisciplinary One Health approaches that consider human-vector and agro-environmental interrelationships.