
From Local Explanations to Global Understanding with Explainable AI for Trees.

Author Information

Lundberg Scott M, Erion Gabriel, Chen Hugh, DeGrave Alex, Prutkin Jordan M, Nair Bala, Katz Ronit, Himmelfarb Jonathan, Bansal Nisha, Lee Su-In

Affiliations

Microsoft Research.

Paul G. Allen School of Computer Science and Engineering, University of Washington.

Publication Information

Nat Mach Intell. 2020 Jan;2(1):56-67. doi: 10.1038/s42256-019-0138-9. Epub 2020 Jan 17.

Abstract

Tree-based machine learning models such as random forests, decision trees and gradient boosted trees are popular non-linear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here, we improve the interpretability of tree-based models through three main contributions: (1) the first polynomial-time algorithm to compute optimal explanations based on game theory; (2) a new type of explanation that directly measures local feature interaction effects; and (3) a new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to (i) identify high-magnitude but low-frequency non-linear mortality risk factors in the US population, (ii) highlight distinct population subgroups with shared risk characteristics, (iii) identify non-linear interaction effects among risk factors for chronic kidney disease and (iv) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model's performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains.

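To make the three contributions concrete, the sketch below exercises them through the authors' open-source `shap` package (https://github.com/shap/shap). This is a minimal sketch, not the paper's experimental setup: the diabetes dataset, the XGBoost model, and its hyperparameters are illustrative assumptions.

```python
# A minimal sketch using the `shap` package that accompanies this paper.
# The dataset, model, and settings below are illustrative assumptions.
import xgboost
import shap
from sklearn.datasets import load_diabetes

# Train a gradient boosted tree ensemble, one of the model families
# targeted by the paper's Tree SHAP algorithm.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor(n_estimators=100, max_depth=4).fit(X, y)

# Contribution 1: exact Shapley-value attributions for every prediction,
# computed in polynomial time over the tree ensemble.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Contribution 2: local feature interaction effects. The result is a
# (n_samples, n_features, n_features) tensor whose off-diagonal entries
# measure pairwise interactions for each individual prediction.
interaction_values = explainer.shap_interaction_values(X)

# Contribution 3: global understanding built from many local explanations,
# e.g. a summary (beeswarm) plot over every per-sample attribution.
shap.summary_plot(shap_values, X)
```

Averaging the absolute attributions over samples yields a global feature importance ranking that stays consistent with the underlying local explanations, which is the "local explanations to global understanding" idea in the title.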

Figure (via PubMed Central): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a29e/7326367/c539f4fb84be/nihms-1601475-f0004.jpg
