[Prediction of trends for fine-scale spread of Oncomelania hupensis in Shanghai Municipality based on supervised machine learning models].

Author information

Gong Y F, Luo Z W, Feng J X, Xue J B, Guo Z Y, Jin Y J, Yu Q, Xia S, Lü S, Xu J, Li S Z

Affiliations

National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), National Health Commission Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai 200025, China.

Shanghai Municipal Center for Disease Control and Prevention, China.

Publication information

Zhongguo Xue Xi Chong Bing Fang Zhi Za Zhi. 2022 Jun 16;34(3):241-251. doi: 10.16250/j.32.1374.2021247.

Abstract

OBJECTIVE

To predict the trends for fine-scale spread of Oncomelania hupensis in Shanghai Municipality based on supervised machine learning models, so as to provide insights into precision snail control.

METHODS

Based on 2016 snail survey data from Shanghai Municipality and climatic, geographical, vegetation and socioeconomic data relating to snail distribution, seven supervised machine learning models were created to predict the risk of snail spread in Shanghai: decision tree, random forest, generalized boosted model, support vector machine, naive Bayes, k-nearest neighbor and C5.0. The performance of the seven models for predicting snail spread was evaluated with the area under the receiver operating characteristic curve (AUC), F1-score and accuracy, and the optimal models were selected to identify the environmental variables affecting snail spread and to predict the areas at risk of snail spread in Shanghai Municipality.
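The three evaluation measures named above can be sketched in a few lines of standard-library Python. This is only an illustration of how AUC, F1-score and accuracy are defined for a binary snail-presence classifier. The function names, the 0.5 cut-off and the toy labels and scores are assumptions made for the example; none of them come from the study.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all cases that were classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def threshold(scores: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn predicted probabilities into hard 0/1 labels (the cut-off is an assumption)."""
    return [1 if s >= cutoff else 0 for s in scores]


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the survey.
    observed = [1, 1, 1, 0, 0, 0]
    predicted_risk = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    labels = threshold(predicted_risk)
    print("AUC:", auc(observed, predicted_risk))
    print("F1-score:", f1_score(observed, labels))
    print("Accuracy:", accuracy(observed, labels))
```

AUC is computed from the predicted scores themselves, whereas F1-score and accuracy depend on the cut-off used to turn scores into presence/absence labels, which is one reason the three measures can rank models differently.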

RESULTS

Seven supervised machine learning models were successfully created to predict the risk of snail spread in Shanghai Municipality. Random forest (AUC = 0.901, F1-score = 0.840, accuracy = 0.797) and the generalized boosted model (AUC = 0.889, F1-score = 0.869, accuracy = 0.835) showed higher predictive performance than the other models. Random forest analysis showed that the four most important climatic variables contributing to snail spread in Shanghai were aridity (11.87%), ≥ 0 °C annual accumulated temperature (10.19%), moisture index (10.18%) and average annual precipitation (9.86%), and the two most important vegetation variables were the vegetation index of the first quarter (8.30%) and the vegetation index of the second quarter (7.69%). Snails were more likely to spread at an aridity of < 0.87, a ≥ 0 °C annual accumulated temperature of 5 550 to 5 675 °C, a moisture index of > 39% and an average annual precipitation of > 1 180 mm, and with a first-quarter vegetation index of > 0.4 and a second-quarter vegetation index of > 0.6. According to the water resource developments and township administrative maps, the areas at risk of snail spread were mainly predicted in 10 townships/subdistricts, covering the Xipian, Dongpian and Tainan sections of southern Shanghai.
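To make the reported ranges concrete, the sketch below checks which of the six favorable conditions a single location meets. It is a simple rule-based screen written for illustration, not the random forest model used in the study, whose predictions combine the variables jointly. The class and function names are hypothetical, the 0.6 threshold is read as applying to the second-quarter vegetation index, and the range bounds of the accumulated temperature are treated as inclusive by assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CellEnvironment:
    """Environmental values for one location (hypothetical container)."""
    aridity: float
    accumulated_temp: float   # >= 0 °C annual accumulated temperature
    moisture_index: float     # percent
    annual_precip_mm: float   # average annual precipitation, mm
    veg_index_q1: float       # vegetation index, first quarter
    veg_index_q2: float       # vegetation index, second quarter


def favorable_conditions(cell: CellEnvironment) -> Dict[str, bool]:
    """Flag each variable that falls in the range reported as favoring snail spread."""
    return {
        "aridity < 0.87": cell.aridity < 0.87,
        # Inclusive bounds are an assumption; the abstract only gives "5 550 to 5 675".
        "accumulated temperature 5550-5675": 5550 <= cell.accumulated_temp <= 5675,
        "moisture index > 39%": cell.moisture_index > 39,
        "precipitation > 1180 mm": cell.annual_precip_mm > 1180,
        "Q1 vegetation index > 0.4": cell.veg_index_q1 > 0.4,
        "Q2 vegetation index > 0.6": cell.veg_index_q2 > 0.6,
    }


def count_favorable(cell: CellEnvironment) -> int:
    """Number of the six conditions that the location satisfies."""
    return sum(favorable_conditions(cell).values())


if __name__ == "__main__":
    # Made-up values for illustration, not survey data.
    example = CellEnvironment(
        aridity=0.85,
        accumulated_temp=5600,
        moisture_index=40,
        annual_precip_mm=1200,
        veg_index_q1=0.45,
        veg_index_q2=0.65,
    )
    for condition, met in favorable_conditions(example).items():
        print(condition, "->", met)
    print("Conditions met:", count_favorable(example))
```

A count of this kind is only a descriptive summary of the thresholds. It carries none of the variable weights or interactions that a fitted model would learn.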

CONCLUSIONS

Supervised machine learning models are effective for predicting the risk of fine-scale snail spread and identifying the environmental determinants of snail spread. The areas at risk of snail spread are mainly located in southwestern Songjiang District, northwestern Jinshan District and southeastern Qingpu District of Shanghai Municipality.
