Plumb Gregory, Terhorst Jonathan, Sankararaman Sriram, Talwalkar Ameet
Carnegie Mellon University.
University of Michigan.
Proc Mach Learn Res. 2020 Jul;119:7762-7771.
A common workflow in data exploration is to learn a low-dimensional representation of the data, identify groups of points in that representation, and examine the differences between the groups to determine what they represent. We treat this workflow as an interpretable machine learning problem by leveraging the model that learned the low-dimensional representation to help identify the key differences between the groups. To solve this problem, we introduce a new type of explanation, a Global Counterfactual Explanation (GCE), and our algorithm, Transitive Global Translations (TGT), for computing GCEs. TGT identifies the differences between each pair of groups using compressed sensing but constrains those pairwise differences to be consistent among all of the groups. Empirically, we demonstrate that TGT is able to identify explanations that accurately explain the model while being relatively sparse, and that these explanations match real patterns in the data.
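The abstract's description of TGT maps naturally onto a small optimization problem. The sketch below is a minimal illustration under simplifying assumptions, not the paper's implementation: the learned representation is replaced by a fixed linear map `W`, each group is summarized by its latent mean, transitivity is enforced by parameterizing every group with a single translation relative to a reference group, and sparsity comes from an l1 subgradient penalty. All variable names, the synthetic data, and the optimizer here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup -- every name below is illustrative, not from the paper's code.
# A fixed linear map W stands in for the learned low-dimensional
# representation; three synthetic groups differ along a few input features.
d_in, d_lat, n = 20, 2, 100
W = rng.normal(size=(d_lat, d_in)) / np.sqrt(d_in)
base = rng.normal(size=(n, d_in))
groups = {
    0: base,
    1: base + 2.0 * np.eye(d_in)[3],   # group 1: feature 3 shifted up
    2: base - 1.5 * np.eye(d_in)[7],   # group 2: feature 7 shifted down
}
latent_mean = {g: (X @ W.T).mean(axis=0) for g, X in groups.items()}

# Parameterize each group g by one translation t[g] relative to a reference
# group (t[0] is pinned to zero), so the explanation for any pair is
# delta[a -> b] = t[b] - t[a].  This makes the pairwise explanations
# consistent (transitive) by construction.
t = np.zeros((3, d_in))
l1, lr = 0.05, 0.05
for _ in range(1000):
    grad = np.zeros_like(t)
    for a in range(3):
        for b in range(3):
            if a == b:
                continue
            # Residual of mapping group a's latent mean onto group b's.
            resid = latent_mean[a] + W @ (t[b] - t[a]) - latent_mean[b]
            g = W.T @ resid
            grad[b] += g
            grad[a] -= g
    # Subgradient step; the l1 term plays the compressed-sensing role of
    # selecting a sparse translation among the many that fit the latent shift.
    t -= lr * (grad + l1 * np.sign(t))
    t[0] = 0.0  # keep the reference group fixed

print("delta 0 -> 1:", np.round(t[1] - t[0], 2))
print("delta 0 -> 2:", np.round(t[2] - t[0], 2))
print("delta 1 -> 2:", np.round(t[2] - t[1], 2))  # = (0 -> 2) - (0 -> 1)
```

Because delta[a -> b] is defined as t[b] - t[a], explanations composed around any cycle of groups cancel exactly, which is the consistency property the abstract describes; the actual TGT objective operates on a trained (generally nonlinear) encoder rather than on latent means under a linear map.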