Yuan Jun, Barr Brian, Overton Kyle, Bertini Enrico
IEEE Trans Vis Comput Graph. 2022 Nov 3;PP. doi: 10.1109/TVCG.2022.3219232.
One of the potential solutions for model interpretation is to train a surrogate model: a more transparent model that approximates the behavior of the model to be explained. Typically, classification rules or decision trees are used due to their logic-based expressions. However, decision trees can grow too deep, and rule sets can become too large to approximate a complex model. Unlike paths on a decision tree that must share ancestor nodes (conditions), rules are more flexible. However, the unstructured visual representation of rules makes it hard to make inferences across rules. In this paper, we focus on tabular data and present novel algorithmic and interactive solutions to address these issues. First, we present Hierarchical Surrogate Rules (HSR), an algorithm that generates hierarchical rules based on user-defined parameters. We also contribute SuRE, a visual analytics (VA) system that integrates HSR and an interactive surrogate rule visualization, the Feature-Aligned Tree, which depicts rules as trees while aligning features for easier comparison. We evaluate the algorithm in terms of parameter sensitivity, time performance, and comparison with surrogate decision trees and find that it scales reasonably well and overcomes the shortcomings of surrogate decision trees. We evaluate the visualization and the system through a usability study and an observational study with domain experts. Our investigation shows that the participants can use feature-aligned trees to perform non-trivial tasks with very high accuracy. We also discuss many interesting findings, including a rule analysis task characterization, that can be used for visualization design and future research.
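As a rough illustration of the surrogate idea described in the abstract (a transparent model trained to approximate the predictions of the model being explained), the sketch below fits a single-condition rule to the outputs of an opaque classifier and reports how often the two agree. It is not the HSR algorithm or any part of SuRE. The `black_box` function, the two features (age, income), the synthetic data, and the helper names are all hypothetical stand-ins chosen for this example.

```python
import random

def black_box(row):
    """Hypothetical opaque model: returns class 1 or 0 for (age, income)."""
    age, income = row
    return 1 if (0.03 * age + 0.00002 * income) > 2.0 else 0

def fit_stump(rows, labels):
    """Search for the one-condition rule 'feature <= threshold -> class'
    that agrees most often with the given labels."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for left in (0, 1):
                preds = [left if r[f] <= t else 1 - left for r in rows]
                agree = sum(p == y for p, y in zip(preds, labels))
                if best is None or agree > best[0]:
                    best = (agree, f, t, left)
    _, f, t, left = best
    return f, t, left

def predict_stump(stump, row):
    """Apply the learned rule to a single row."""
    f, t, left = stump
    return left if row[f] <= t else 1 - left

def fidelity(stump, rows, labels):
    """Fraction of rows where the surrogate matches the black box."""
    hits = sum(predict_stump(stump, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

# Synthetic tabular data (assumed for illustration only).
rng = random.Random(0)
rows = [(rng.randint(18, 80), rng.randint(20000, 120000)) for _ in range(200)]

# The surrogate is trained on the black box's predictions, not on ground truth.
labels = [black_box(r) for r in rows]

stump = fit_stump(rows, labels)
score = fidelity(stump, rows, labels)

feature_names = ["age", "income"]
f, t, left = stump
print(f"IF {feature_names[f]} <= {t} THEN class {left} ELSE class {1 - left}")
print(f"fidelity to black box: {score:.2f}")
```

A one-condition rule is deliberately too simple to match the black box everywhere. Adding conditions raises fidelity at the cost of readability, which is the trade-off between deep trees, large rule sets, and interpretability that the abstract points to.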