Methods to quantify variable importance: implications for the analysis of noisy ecological data.

Author information

Murray Kim, Conner Mary M

Affiliation

Snow Leopard Trust, Seattle, Washington 98103, USA.

Publication information

Ecology. 2009 Feb;90(2):348-55. doi: 10.1890/07-1929.1.

Abstract

Determining the importance of independent variables is of practical relevance to ecologists and managers concerned with allocating limited resources to the management of natural systems. Although techniques that identify explanatory variables having the largest influence on the response variable are needed to design management actions effectively, the use of various indices to evaluate variable importance is poorly understood. Using Monte Carlo simulations, we compared six different indices commonly used to evaluate variable importance: zero-order correlations, partial correlations, semipartial correlations, standardized regression coefficients, Akaike weights, and independent effects. We simulated four scenarios to evaluate the indices under progressively more complex circumstances that included correlation between explanatory variables, as well as a spurious variable that was correlated with other explanatory variables, but not with the dependent variable. No index performed perfectly under all circumstances, but partial correlations and Akaike weights performed poorly in all cases. Zero-order correlation was the only measure that detected the presence of a spurious variable, whereas only independent effects assigned overlap areas correctly once the spurious variable was removed. We therefore recommend using zero-order correlations to eliminate predictor variables with correlations near zero, followed by the use of independent effects to assign overlap areas and rank variable importance.
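To make the indices named in the abstract concrete, the following is a minimal Python sketch that computes five of them (zero-order, partial, and semipartial correlations, standardized regression coefficients, and independent effects) on one simulated data set. Everything in it is an assumption made for illustration: the function names (`r_squared`, `residuals`, `importance_indices`), the sample size, and the data-generating process are hypothetical and are not taken from the paper, and the run is not meant to reproduce the paper's scenarios or findings. Independent effects are computed here as the average gain in R² from adding a variable, averaged within and then across subset sizes, which is one common formulation of hierarchical partitioning. Akaike weights are omitted for brevity.

```python
import itertools
import numpy as np


def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X (intercept included)."""
    if X.shape[1] == 0:
        return 0.0  # intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()


def residuals(X, y):
    """Residuals of y after an OLS fit on the columns of X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ coef


def importance_indices(X, y):
    """Return a dict of importance indices, one array entry per column of X."""
    n, p = X.shape
    # Standardized regression coefficients: OLS on z-scored variables.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)

    zero_order = [np.corrcoef(X[:, j], y)[0, 1] for j in range(p)]
    partial, semipartial, independent = [], [], []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        xj_res = residuals(X[:, others], X[:, j])  # x_j with the others removed
        y_res = residuals(X[:, others], y)         # y with the others removed
        semipartial.append(np.corrcoef(xj_res, y)[0, 1])
        partial.append(np.corrcoef(xj_res, y_res)[0, 1])
        # Independent effect: mean R^2 gain from adding x_j, averaged over
        # subsets of each size, then averaged over the subset sizes.
        level_means = []
        for k in range(p):
            gains = [
                r_squared(X[:, list(S) + [j]], y) - r_squared(X[:, list(S)], y)
                for S in itertools.combinations(others, k)
            ]
            level_means.append(np.mean(gains))
        independent.append(np.mean(level_means))

    return {
        "zero_order": np.array(zero_order),
        "partial": np.array(partial),
        "semipartial": np.array(semipartial),
        "std_beta": beta,
        "independent": np.array(independent),
    }


# Hypothetical data: x1 and x2 drive y; x3 is correlated with both
# predictors but, by construction, uncorrelated with y in expectation.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.7 * (x1 - x2) + rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

indices = importance_indices(X, y)
for name, values in indices.items():
    print(f"{name:12s}", np.round(values, 3))
```

Two properties of these definitions can be checked on any data set: the independent effects sum to the R² of the full model, and each squared semipartial correlation equals the drop in R² when that variable is removed from the full model.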

