Lage Isaac, Ross Andrew Slavin, Kim Been, Gershman Samuel J, Doshi-Velez Finale
Department of Computer Science, Harvard University.
Google Brain.
Adv Neural Inf Process Syst. 2018 Dec;31.
We often desire our models to be interpretable as well as accurate. Prior work on optimizing models for interpretability has relied on easy-to-quantify proxies for interpretability, such as sparsity or the number of operations required. In this work, we optimize for interpretability by including humans in the optimization loop. We develop an algorithm that minimizes the number of user studies needed to find models that are both predictive and interpretable, and we demonstrate our approach on several datasets. Our human-subjects results show trends toward different proxy notions of interpretability on different datasets, suggesting that different proxies are preferred for different tasks.
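The sketch below is a minimal, hypothetical illustration of the general idea described in the abstract (putting a human in the model-selection loop while keeping the number of user studies small); it is not the authors' algorithm. The names candidate_models, run_user_study, and STUDY_BUDGET are assumptions introduced here for illustration, and the "user study" is faked with a tree-size proxy so the example runs end to end.

```python
# Hypothetical human-in-the-loop model selection sketch (not the paper's method):
# pre-filter candidates by validation accuracy, then spend a small budget of
# (simulated) user studies to pick the candidate humans find easiest to use.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Candidate models spanning different proxy levels of interpretability
# (tree depth as a stand-in for model complexity).
candidate_models = [
    DecisionTreeClassifier(max_depth=d, random_state=0).fit(X_tr, y_tr)
    for d in (1, 2, 3, 5, 8, None)
]

def run_user_study(model):
    """Placeholder for a real user study; returns a cost (lower is better).

    In practice this would measure, e.g., how quickly and accurately people
    can simulate the model's predictions. Here we substitute node count so
    the sketch is runnable.
    """
    return model.tree_.node_count

STUDY_BUDGET = 3  # keep the number of expensive user studies small

# 1) Cheap filter: keep candidates that are accurate enough on held-out data.
accurate = [m for m in candidate_models if m.score(X_val, y_val) >= 0.9]

# 2) Spend the user-study budget on the simplest accurate candidates.
shortlist = sorted(accurate, key=lambda m: m.get_depth())[:STUDY_BUDGET]
best = min(shortlist, key=run_user_study)

print("chosen depth:", best.get_depth(),
      "validation accuracy:", round(best.score(X_val, y_val), 3))
```

A real pipeline would replace run_user_study with measurements from human participants and could use a more sample-efficient search (e.g., a bandit or Bayesian optimization loop) over the candidate models, in keeping with the abstract's goal of minimizing the number of user studies.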