An interpretable machine learning method for supporting ecosystem management: Application to species distribution models of freshwater macroinvertebrates.

Affiliations

School of Environmental Engineering, University of Seoul, Dongdaemun-gu, Seoul, 02504, Republic of Korea.

Publication information

J Environ Manage. 2021 Aug 1;291:112719. doi: 10.1016/j.jenvman.2021.112719. Epub 2021 May 1.

Abstract

Species distribution models (SDMs), in which species occurrences are related to a suite of environmental variables, have been used as a decision-making tool in ecosystem management. Complex machine learning (ML) algorithms that lack interpretability may hinder the use of SDMs for ecological explanations, possibly limiting the role of SDMs as a decision-support tool. To meet the growing demand for explainable ML, several interpretable ML methods have recently been proposed. Among these methods, SHapley Additive exPlanations (SHAP) has drawn attention for its robust theoretical justification and analytical gains. In this study, the utility of SHAP was demonstrated by the application of SDMs of four benthic macroinvertebrate species. In addition to species responses, the dataset contained 22 environmental variables monitored at 436 sites across five major rivers of South Korea. A range of ML algorithms was employed for model development. Each ML model was trained and optimized using 10-fold cross-validation. Model evaluation based on the test dataset indicated strong model performance, with an accuracy of ≥0.7 in all evaluation metrics for all MLs and species. However, only the random forest algorithm showed behavior consistent with the known ecology of the investigated species. SHAP presents an integrated framework in which local interpretations that incorporate local interaction effects are combined to represent the global model structure. Consequently, this framework offered a novel opportunity to assess the importance of variables in predicting species occurrence, not only across sites, but also for individual sites. Furthermore, removing interaction effects from variable importance values (SHAP values) clearly revealed non-linear species responses to variations in environmental variables, indicating the existence of ecological thresholds. This study provides guidelines for the use of a new interpretable method supporting ecosystem management.
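The abstract's key property of SHAP is that per-site (local) attributions sum to the difference between a site's prediction and the average prediction, so they can be aggregated into global variable importance. The study itself used the SHAP library on its monitoring data; as a minimal illustrative sketch only (synthetic data, hypothetical feature setup, not the authors' code), exact Shapley values for a small random-forest SDM can be computed directly from the Shapley definition, with an interventional value function that fixes the features in a coalition to the focal site's values and averages over a background sample:

```python
import itertools
import math

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the paper's data: 3 hypothetical environmental
# variables predicting species presence (1) or absence (0) at 300 sites.
X = rng.normal(size=(300, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def value(instance, subset, background, model):
    """v(S): expected predicted presence probability when the features in S
    are fixed to the focal site's values and the rest follow the background."""
    Xb = background.copy()
    Xb[:, list(subset)] = instance[list(subset)]
    return model.predict_proba(Xb)[:, 1].mean()

def shapley_values(instance, background, model):
    """Exact Shapley attributions: weighted marginal contributions of each
    feature over every coalition of the remaining features."""
    n = instance.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in itertools.combinations(others, k):
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                phi[i] += w * (value(instance, S + (i,), background, model)
                               - value(instance, S, background, model))
    return phi

site = X[0]                     # one monitoring site
background = X[:100]            # background sample for the expectation
phi = shapley_values(site, background, model)
baseline = model.predict_proba(background)[:, 1].mean()

# Local accuracy (additivity): attributions sum to f(site) - E[f],
# which is what lets local SHAP values roll up into global importance.
print(phi, phi.sum(), model.predict_proba(site[None])[0, 1] - baseline)
```

This brute-force enumeration is exponential in the number of features and only feasible for toy examples; for the paper's 22 variables one would use the SHAP library's TreeSHAP algorithm, which computes the same attributions efficiently for tree ensembles.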

