Pham Trong-Thang, Brecheisen Jacob, Wu Carol C, Nguyen Hien, Deng Zhigang, Adjeroh Donald, Doretto Gianfranco, Choudhary Arabinda, Le Ngan
AICV Lab, Department of EECS, University of Arkansas, AR 72701, USA.
MD Anderson Cancer Center, Houston, TX 77079, USA.
Artif Intell Med. 2025 Feb;160:103054. doi: 10.1016/j.artmed.2024.103054. Epub 2024 Dec 12.
Deep learning has attracted great interest for computer-aided diagnosis systems owing to its impressive performance in both general and medical domains. However, a notable challenge is the lack of explainability of many advanced models, which poses risks in critical applications such as diagnosing findings in chest X-rays (CXR). To address this problem, we propose ItpCtrl-AI, a novel end-to-end interpretable and controllable framework that mirrors the decision-making process of the radiologist. By emulating the eye-gaze patterns of radiologists, our framework first determines the focal areas and assesses the significance of each pixel within those regions. The model then generates an attention heatmap representing the radiologist's attention, which is used to extract the attended visual information for diagnosing findings. By accepting directional input, our framework is controllable by the user. Furthermore, by displaying the eye-gaze heatmap that guides the diagnostic conclusion, the underlying rationale behind the model's decision is revealed, making it interpretable. In addition to developing an interpretable and controllable framework, our work includes the creation of a dataset, named Diagnosed-Gaze++, which aligns medical findings with eye-gaze data. Our extensive experimentation validates the effectiveness of our approach in generating accurate attention heatmaps and diagnoses. The experimental results show that our model not only accurately identifies medical findings but also closely reproduces the eye-gaze attention of radiologists. The dataset, models, and source code will be made publicly available upon acceptance.
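The core idea described above (a predicted gaze heatmap acting as a soft mask that selects the "attended" visual information before diagnosis) can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' implementation: the function names and the toy image are hypothetical, and a real pipeline would predict the heatmap with a learned model.

```python
import numpy as np

def normalize_heatmap(h):
    # Rescale a predicted gaze heatmap to [0, 1] so it acts as a soft mask.
    h = h - h.min()
    return h / (h.max() + 1e-8)

def attend(image, heatmap):
    # Emulate the "attended visual information" step: regions the
    # radiologist fixates on are kept; the rest are suppressed.
    return image * normalize_heatmap(heatmap)

# Toy example: a 4x4 "image" and a gaze heatmap peaked near the center.
rng = np.random.default_rng(0)
image = rng.random((4, 4))
yy, xx = np.mgrid[0:4, 0:4]
heatmap = np.exp(-((yy - 1.5) ** 2 + (xx - 1.5) ** 2))

attended = attend(image, heatmap)
# Pixels near the gaze peak are preserved; corner pixels are attenuated.
```

In the full framework, a downstream classifier would consume `attended` (or attention-weighted features) to produce the diagnosis, and the heatmap itself is shown to the user as the explanation.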