Lundberg Scott M, Erion Gabriel, Chen Hugh, DeGrave Alex, Prutkin Jordan M, Nair Bala, Katz Ronit, Himmelfarb Jonathan, Bansal Nisha, Lee Su-In
Microsoft Research.
Paul G. Allen School of Computer Science and Engineering, University of Washington.
Nat Mach Intell. 2020 Jan;2(1):56-67. doi: 10.1038/s42256-019-0138-9. Epub 2020 Jan 17.
Tree-based machine learning models such as random forests, decision trees, and gradient boosted trees are popular non-linear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here, we improve the interpretability of tree-based models through three main contributions: 1) the first polynomial-time algorithm to compute optimal explanations based on game theory; 2) a new type of explanation that directly measures local feature interaction effects; and 3) a new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to i) identify high-magnitude but low-frequency non-linear mortality risk factors in the US population, ii) highlight distinct population sub-groups with shared risk characteristics, iii) identify non-linear interaction effects among risk factors for chronic kidney disease, and iv) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model's performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains.
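As a rough illustration of what a game-theoretic explanation attributes to each feature, the sketch below computes Shapley values by brute-force enumeration of feature subsets. It is not the polynomial-time algorithm the abstract refers to: it runs in exponential time and is only meant to show the quantity being computed. The toy tree, the feature names (`age`, `bp`), the background rows, and the choice to average over background rows for missing features are all assumptions made for this example.

```python
from itertools import combinations
from math import factorial


def toy_tree(x):
    """Hypothetical two-split decision tree over features (age, bp)."""
    age, bp = x
    if age < 50:
        return 1.0
    return 5.0 if bp >= 140 else 2.0


def value(model, x, background, subset):
    """Mean model output with features in `subset` fixed to x's values
    and all other features taken from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every subset (exponential time)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classic Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (value(model, x, background, s | {i})
                        - value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Assumed background data and the instance to explain.
background = [(30, 120), (40, 150), (60, 130), (70, 160)]
x = (65, 150)

phi = shapley_values(toy_tree, x, background)
base = value(toy_tree, x, background, set())
print("base value:", base)
print("attributions (age, bp):", phi)
```

The attributions plus the base value add up to the model's output for `x`, which is the local faithfulness property the abstract mentions. Collecting such per-prediction attributions over many rows is one way to read the "combining many local explanations" idea, although the tools described in the paper are more elaborate than this sketch.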