Kaczmarzyk Jakub R, Saltz Joel H, Koo Peter K
Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA.
Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
ArXiv. 2024 Nov 18:arXiv:2409.03080v2.
Deep learning models hold great promise for digital pathology, but their opaque decision-making processes undermine trust and hinder clinical adoption. Explainable AI methods are essential to enhance model transparency and reliability.
We developed HIPPO, an explainable AI framework that systematically modifies tissue regions in whole slide images to generate image counterfactuals, enabling quantitative hypothesis testing, bias detection, and model evaluation beyond traditional performance metrics. HIPPO was applied to a variety of clinically important tasks, including breast metastasis detection in axillary lymph nodes, prognostication in breast cancer and melanoma, and mutation classification in gliomas. In computational experiments, HIPPO was compared against traditional metrics and attention-based approaches to assess its ability to identify key tissue elements driving model predictions.
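Below is a minimal, hypothetical sketch of the kind of counterfactual test this framework enables for a weakly-supervised (multiple-instance learning) slide classifier: remove the patch embeddings corresponding to a tissue region of interest, re-run the model, and measure the change in prediction. The function and variable names (counterfactual_effect, remove_mask, the model signature) are illustrative assumptions, not HIPPO's actual API.

```python
# Sketch of a patch-occlusion counterfactual for an attention-based MIL model.
# Assumes a model that maps a bag of patch embeddings (1, num_patches, dim)
# to a single logit; this interface is an assumption for illustration.
import numpy as np
import torch


def counterfactual_effect(model: torch.nn.Module,
                          bag: torch.Tensor,
                          remove_mask: np.ndarray) -> float:
    """Return the drop in predicted probability after removing marked patches.

    bag:         (num_patches, embedding_dim) patch embeddings for one slide.
    remove_mask: boolean array marking patches to delete (e.g. annotated tumor).
    """
    model.eval()
    with torch.no_grad():
        # Prediction on the original slide (bag of all patches).
        p_original = torch.sigmoid(model(bag.unsqueeze(0))).item()
        # Counterfactual slide: the same bag with the marked patches removed.
        kept = bag[torch.from_numpy(~remove_mask)]
        p_counterfactual = torch.sigmoid(model(kept.unsqueeze(0))).item()
    # A large positive value suggests the removed region drove the prediction.
    return p_original - p_counterfactual
```

A large effect size under this kind of intervention provides direct, quantitative evidence that the removed tissue drives the model's output, which is the sort of hypothesis test that attention maps alone cannot supply.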
In metastasis detection, HIPPO uncovered critical model limitations that were undetectable by standard performance metrics or attention-based methods. For prognostic prediction, HIPPO outperformed attention by providing more nuanced insights into the tissue elements influencing outcomes. In a proof-of-concept study, HIPPO facilitated hypothesis generation for identifying melanoma patients who may benefit from immunotherapy. In mutation classification, HIPPO identified the pathology regions responsible for false negatives more robustly than attention, suggesting its potential to better explain model decisions.
HIPPO expands the explainable AI toolkit for computational pathology by enabling deeper insights into model behavior. This framework supports the trustworthy development, deployment, and regulation of weakly-supervised models in clinical and research settings, promoting their broader adoption in digital pathology.