Yimam Abdurohman, Mekuriaw Asnake, Assefa Dessie, Bewket Woldeamlak
Department of Geography and Environmental Studies, Addis Ababa University, Addis Ababa, Ethiopia.
Department of Geography and Environmental Studies, Debre Birhan University, Debre Birhan, Ethiopia.
Heliyon. 2024 Sep 25;10(19):e38419. doi: 10.1016/j.heliyon.2024.e38419. eCollection 2024 Oct 15.
[species name] plantations are widespread in the highlands of northern Ethiopia. The species has been used for centuries for various purposes. However, it is controversial because of its reportedly excessive consumption of soil nutrients and water. Modelling the spatial distribution of the species is fundamental to understanding its ecological and hydrological effects in the region and to informing policy. The purpose of this study is therefore to develop a model for mapping the spatial distribution of [species name]. We used the spectral bands of Sentinel-2 data, vegetation indices, and environmental data as predictor variables, and three machine learning algorithms (Random Forest, Support Vector Machine, and Boosted Regression Trees) to model the current distribution of [species name]. Eleven of the twenty-five predictor variables were filtered using a variance inflation factor (VIF). A total of 419 in situ georeferenced data points were used for training and validating the models. The area under the curve (AUC), kappa statistic (K), true skill statistic (TSS), root mean squared error (RMSE), and coefficient of determination (R²) were used to evaluate the models' performance. The validation metrics showed that Random Forest performed best. The Random Forest prediction map revealed that [species name] was fairly well detected in non-[species name] woody vegetation (R² = 0.86, P < 0.001; RMSE = 0.31). We found that the Green Normalized Difference Vegetation Index and environmental variables, such as elevation and distance from the road, were the most important predictor variables in explaining the distribution of [species name]. Our findings demonstrate that machine learning algorithms using Sentinel-2 spectral bands and vegetation indices, combined with environmental data, can effectively model the spatial distribution of [species name].
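As an illustration of three quantities named in the abstract, the following minimal sketch computes the Green Normalized Difference Vegetation Index from near-infrared and green reflectance, and the kappa and true skill statistics from binary presence/absence predictions. It is not the authors' code: the function names, the toy reflectance values, and the eight-point validation sample are hypothetical, and the formulas are the standard textbook definitions rather than anything stated in the abstract.

```python
def gndvi(nir, green):
    """Green NDVI = (NIR - Green) / (NIR + Green); for Sentinel-2 these are typically bands 8 and 3."""
    total = nir + green
    return (nir - green) / total if total != 0 else 0.0


def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def true_skill_statistic(tp, fp, fn, tn):
    """TSS = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, fn, tn):
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    chance_agreement = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


# Hypothetical validation sample: 1 = species present, 0 = absent.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(observed, predicted)

print("GNDVI:", gndvi(nir=0.4, green=0.1))
print("TSS:", true_skill_statistic(*counts))
print("Kappa:", cohen_kappa(*counts))
```

Both TSS and kappa range up to 1 for perfect agreement; TSS is often preferred for presence/absence models because, unlike kappa, it does not depend on how common the species is in the validation sample.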