Graziani Mara, Dutkiewicz Lidia, Calvaresi Davide, Amorim José Pereira, Yordanova Katerina, Vered Mor, Nair Rahul, Abreu Pedro Henriques, Blanke Tobias, Pulignano Valeria, Prior John O, Lauwaert Lode, Reijers Wessel, Depeursinge Adrien, Andrearczyk Vincent, Müller Henning
University of Applied Sciences of Western Switzerland (HES-SO Valais), Rue du Technopole 3, 3960 Sierre, Valais, Switzerland.
Department of Computer Science, University of Geneva (UniGe), Route de Drize 7, 1227 Carouge, Geneva, Switzerland.
Artif Intell Rev. 2023;56(4):3473-3504. doi: 10.1007/s10462-022-10256-8. Epub 2022 Sep 6.
Since its emergence in the 1960s, Artificial Intelligence (AI) has grown to conquer many technology products and their fields of application. Machine learning, as a major part of current AI solutions, can learn from data and through experience to reach high performance on various tasks. The growing success of AI algorithms has created a need for interpretability to understand opaque models such as deep neural networks. Different domains have raised various requirements, together with numerous tools to debug models, justify their outcomes, and establish their safety, fairness and reliability. This variety of tasks has led to inconsistencies in terminology, with terms such as interpretability, understandability, and transparency often being used interchangeably in methodology papers. These words, however, convey different meanings and are "weighted" differently across domains, for example in the technical and social sciences. In this paper, we propose an overarching terminology of interpretability of AI systems that can be referred to by technical developers as much as by the social sciences community, pursuing clarity and efficiency in the definition of regulations for ethical and reliable AI development. We show how our taxonomy and definition of interpretable AI differ from those in previous research and how they apply with high versatility to several domains and use cases, proposing a much-needed standard for communication among the interdisciplinary areas of AI.