Mollalo Abolfazl, Sadeghian Ali, Israel Glenn D, Rashidi Parisa, Sofizadeh Aioub, Glass Gregory E
Department of Geography, University of Florida, Gainesville, FL, USA.
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA.
Acta Trop. 2018 Dec;188:187-194. doi: 10.1016/j.actatropica.2018.09.004. Epub 2018 Sep 7.
The distribution and abundance of Phlebotomus papatasi, the primary vector of zoonotic cutaneous leishmaniasis in most arid and semi-arid countries, pose a major public health challenge. This study compares several approaches to modeling the spatial distribution of the species in an endemic region of Golestan province, northeastern Iran, with the aim of helping decision makers target interventions. We developed a geo-database of Phlebotominae sand flies collected from different parts of the study region using sticky paper traps coated with castor oil. Ph. papatasi was present at 44 of the 142 sampling sites. We also gathered and prepared data on related environmental factors in a GIS framework, including topography, weather variables, distance to main rivers, and remotely sensed data such as the normalized difference vegetation index (NDVI) and land surface temperature (LST). Three classifiers, (vanilla) logistic regression, random forest and support vector machine (SVM), were compared for predicting presence/absence of the vector. Predictive performance was assessed on an independent dataset using the area under the ROC curve (AUC) and the Kappa statistic. All three models successfully predicted the presence/absence of the vector; however, the SVM classifier (Accuracy = 0.906, AUC = 0.974, Kappa = 0.876) outperformed the other classifiers in predictive accuracy. It was also the most sensitive (85%) and the most specific (93%) model. Sensitivity analysis of the most accurate model (i.e. SVM) revealed that slope, nighttime LST in October and mean temperature of the wettest quarter were among the most important predictors. The findings suggest that machine learning techniques, especially the SVM classifier, when coupled with GIS and remote sensing data, can be a useful and cost-effective way of identifying habitat suitability for the species.
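The performance measures named in the abstract (accuracy, sensitivity, specificity, Kappa and AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold and the ten labelled sites below are hypothetical and chosen only to show how each measure is derived from presence/absence labels and classifier scores.

```python
from itertools import product


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for labels coded 1 = present, 0 = absent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def cohen_kappa(tp, fp, tn, fn):
    """Agreement beyond chance; undefined if chance agreement equals 1."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Chance agreement computed from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (observed - expected) / (1 - expected)


def auc(y_true, scores):
    """Probability that a presence site outscores an absence site (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def evaluate(y_true, scores, threshold=0.5):
    """Threshold the scores (assumed cut-off) and collect all five measures."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "kappa": cohen_kappa(tp, fp, tn, fn),
        "auc": auc(y_true, scores),
    }


# Hypothetical held-out sites: true presence/absence and classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]
metrics = evaluate(y_true, scores)
```

Accuracy, sensitivity, specificity and Kappa depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason studies of this kind often report both kinds of measure.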