IEEE J Biomed Health Inform. 2024 Nov;28(11):6466-6473. doi: 10.1109/JBHI.2024.3395289. Epub 2024 Nov 6.
Explainable Artificial Intelligence (XAI) provides tools that help users understand how AI models work and how they reach a particular decision or outcome. It increases the interpretability of models and makes them more trustworthy and transparent. In this context, many XAI methods have been proposed to make complex, black-box models more digestible from a human perspective. However, one of the main issues that XAI methods face, especially when dealing with a large number of features, is multicollinearity, which casts doubt on the robustness of XAI outcomes such as the ranking of informative features. Most current XAI methods either do not consider collinearity or assume that the features are independent, which is not necessarily true in general. Here, we propose a simple yet useful proxy that modifies the outcome of any XAI feature ranking method so that it accounts for the dependency among the features and reveals their impact on the outcome. The proposed method was applied to SHAP, as an example of an XAI method that assumes the features are independent. For this purpose, several models were used for a well-known classification task (males versus females), with nine cardiac phenotypes extracted from cardiac magnetic resonance imaging as features. Principal component analysis and biological plausibility were used to validate the proposed method. Our results showed that, in the presence of collinearity, the proposed proxy could lead to a more robust list of informative features than the original SHAP.
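The collinearity problem described in the abstract can be illustrated with a small sketch. The abstract does not specify the form of the proposed proxy, so the code below is not the authors' method: `correlation_adjusted_importance` is a hypothetical adjustment, chosen only to show how a feature ranking might be post-processed to reflect dependency among features. The synthetic data, the coefficients, and all function names are assumptions. The sketch relies on the known result that, for a linear model under the feature-independence assumption, the SHAP value of feature i is coef_i * (x_i - mean(x_i)).

```python
import numpy as np


def mean_abs_linear_shap(X, coef):
    """Mean |SHAP| per feature for a linear model, assuming independent features."""
    return np.abs((X - X.mean(axis=0)) * coef).mean(axis=0)


def correlation_adjusted_importance(importance, X):
    """Hypothetical adjustment (NOT the paper's proxy).

    Each feature shares its importance with every feature in proportion
    to the absolute Pearson correlation between them. The total
    importance is preserved.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    weights = corr / corr.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights.T @ importance


# Synthetic data (assumption): x1 is almost a copy of x0, x2 is independent.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

# A linear model that happened to put all of the shared signal on x0.
coef = np.array([2.0, 0.0, 0.5])

raw = mean_abs_linear_shap(X, coef)
adjusted = correlation_adjusted_importance(raw, X)

print("raw ranking:     ", np.argsort(-raw))
print("adjusted ranking:", np.argsort(-adjusted))
```

In this toy setting the raw ranking gives x1 zero importance even though it carries nearly the same information as x0, while the adjusted ranking places the two collinear features together. Whether the published proxy behaves similarly cannot be determined from the abstract alone.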