Tešić Marko, Hahn Ulrike
Birkbeck, University of London, London, UK.
Patterns (N Y). 2022 Dec 9;3(12):100635. doi: 10.1016/j.patter.2022.100635.
Counterfactual (CF) explanations have been employed as one of the modes of explainability in explainable artificial intelligence (AI), both to increase the transparency of AI systems and to provide recourse. Cognitive science and psychology have pointed out that people regularly use CFs to express causal relationships. Most AI systems, however, are only able to capture associations or correlations in data, so interpreting them as causal would not be justified. In this Perspective, we present two experiments (total n = 364) exploring the effects of CF explanations of AI systems' predictions on lay people's causal beliefs about the real world. In Experiment 1, we found that providing CF explanations of an AI system's predictions does indeed (unjustifiably) affect people's causal beliefs regarding the factors/features the AI uses, making people more likely to view those factors/features as causal in the real world. Inspired by the literature on misinformation and health warning messaging, Experiment 2 tested whether we can correct for this unjustified change in causal beliefs. We found that pointing out that AI systems capture correlations and not necessarily causal relationships can attenuate the effects of CF explanations on people's causal beliefs.