Grad-CAM-Based Explainable Artificial Intelligence Related to Medical Text Processing.

Authors

Zhang Hongjian, Ogasawara Katsuhiko

Affiliation

Graduate School of Health Science, Hokkaido University, N12-W5, Kitaku, Sapporo 060-0812, Japan.

Publication information

Bioengineering (Basel). 2023 Sep 10;10(9):1070. doi: 10.3390/bioengineering10091070.

DOI:10.3390/bioengineering10091070

PMID:37760173

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10525184/

Abstract

The opacity of deep learning makes it challenging to apply in the medical field. Explainable artificial intelligence (XAI) is therefore needed so that models and their results can be explained in a manner that humans can understand. This study applies transfer learning from a high-accuracy computer vision model to medical text tasks and uses gradient-weighted class activation mapping (Grad-CAM), an explanatory visualization method, to generate heat maps that present the basis for the model's decisions intuitively. The system comprises four modules: pre-processing, word embedding, classifier, and visualization. We compared Word2Vec and BERT as word embeddings and compared ResNet and a one-dimensional convolutional neural network (1D CNN) as classifiers. Finally, a Bi-LSTM was used to perform text classification as a direct comparison. With 25 epochs, the model that used pre-trained ResNet on the formalized text performed best (recall of 90.9%, precision of 91.1%, and F1 score of 90.2%, weighted). By processing medical texts with ResNet and Grad-CAM-based explainable artificial intelligence, this study achieves high classification accuracy; at the same time, Grad-CAM visualization intuitively shows the words to which the model pays attention when making predictions.
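To make the Grad-CAM step concrete, the following is a minimal sketch of how a per-word heat map can be computed from a convolutional layer's feature maps and the gradients of the predicted class score. It is not the authors' implementation: the function name `grad_cam_1d`, the simplification to one spatial axis (token position), and all numeric values are hypothetical and chosen only for illustration. In practice the activations and gradients would come from the trained classifier.

```python
def grad_cam_1d(activations, gradients):
    """Grad-CAM over token positions (illustrative, 1D simplification).

    activations[k][t]: feature map of channel k at token position t.
    gradients[k][t]:   gradient of the class score w.r.t. activations[k][t].
    Returns one importance value per token, scaled to [0, 1].
    """
    num_tokens = len(activations[0])
    # Channel weights: average each channel's gradients over all positions.
    weights = [sum(row) / len(row) for row in gradients]
    # Weighted sum of feature maps per position, followed by ReLU.
    cam = [
        max(0.0, sum(w * row[t] for w, row in zip(weights, activations)))
        for t in range(num_tokens)
    ]
    peak = max(cam)
    # Normalize so the most influential token has value 1 (skip if all zero).
    return [v / peak for v in cam] if peak > 0 else cam


# Hypothetical example: 2 channels, 4 tokens (values are made up).
tokens = ["patient", "reports", "chest", "pain"]
activations = [[0.1, 0.0, 0.8, 0.9],
               [0.5, 0.4, 0.1, 0.0]]
gradients = [[0.2, 0.2, 0.2, 0.2],
             [-0.1, -0.1, -0.1, -0.1]]

heat = grad_cam_1d(activations, gradients)
for token, value in zip(tokens, heat):
    print(f"{token:>8s}  {value:.2f}")
```

Tokens with values near 1 are those the (hypothetical) model relied on most for the predicted class; tokens whose weighted activation is negative are clipped to 0 by the ReLU.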
