Upol Ehsan, Mark O. Riedl
Georgia Institute of Technology, Atlanta, GA, USA.
Patterns (N Y). 2024 Jun 14;5(6):100971. doi: 10.1016/j.patter.2024.100971.
To make explainable artificial intelligence (XAI) systems trustworthy, understanding their harmful effects is important. In this paper, we address an important yet unarticulated type of negative effect in XAI. We introduce explainability pitfalls (EPs): unanticipated negative downstream effects of AI explanations that manifest even when there is no intention to manipulate users. EPs differ from dark patterns, which are intentionally deceptive practices. We articulate the concept of EPs by demarcating them from dark patterns and by highlighting the challenges arising from uncertainties around pitfalls. We situate and operationalize the concept using a case study that showcases how, despite the best intentions, unsuspected negative effects, such as unwarranted trust in numerical explanations, can emerge. We propose proactive and preventative strategies to address EPs at three interconnected levels: research, design, and organizational. We discuss design and societal implications around reframing AI adoption, recalibrating stakeholder empowerment, and resisting the "move fast and break things" mindset.