
Eye-Guided Multimodal Fusion: Toward an Adaptive Learning Framework Using Explainable Artificial Intelligence

Authors

Moradizeyveh Sahar, Hanif Ambreen, Liu Sidong, Qi Yuankai, Beheshti Amin, Di Ieva Antonio

Affiliations

Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney 2113, Australia.

Centre for Applied Artificial Intelligence, School of Computing, Faculty of Science and Engineering, Macquarie University, Sydney 2113, Australia.

Publication

Sensors (Basel). 2025 Jul 24;25(15):4575. doi: 10.3390/s25154575.

DOI: 10.3390/s25154575
PMID: 40807742
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12349219/
Abstract

Interpreting diagnostic imaging and identifying clinically relevant features remain challenging tasks, particularly for novice radiologists who often lack structured guidance and expert feedback. To bridge this gap, we propose an Eye-Gaze Guided Multimodal Fusion framework that leverages expert eye-tracking data to enhance learning and decision-making in medical image interpretation. By integrating chest X-ray (CXR) images with expert fixation maps, our approach captures radiologists' visual attention patterns and highlights regions of interest (ROIs) critical for accurate diagnosis. The fusion model utilizes a shared backbone architecture to jointly process image and gaze modalities, thereby minimizing the impact of noise in fixation data. We validate the system's interpretability using Gradient-weighted Class Activation Mapping (Grad-CAM) and assess both classification performance and explanation alignment with expert annotations. Comprehensive evaluations, including robustness under gaze noise and expert clinical review, demonstrate the framework's effectiveness in improving model reliability and interpretability. This work offers a promising pathway toward intelligent, human-centered AI systems that support both diagnostic accuracy and medical training.
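The fusion step the abstract describes (a shared backbone jointly processing the CXR image and the expert fixation map) can be sketched as a simple early fusion: normalize the fixation map into a spatial attention channel and stack it with the image so one network sees both modalities. This is a minimal NumPy illustration, not the authors' implementation; the function names and array shapes are hypothetical.

```python
import numpy as np

def normalize_fixation_map(fix, eps=1e-8):
    """Turn raw expert fixation intensities into a spatial attention map in [0, 1]."""
    fix = np.clip(fix, 0.0, None)          # fixation intensity cannot be negative
    fix = fix / (fix.sum() + eps)          # probability distribution over pixels
    return fix / (fix.max() + eps)         # rescale so the peak is ~1 for fusion

def fuse_image_and_gaze(image, fix):
    """Early fusion: stack the CXR image and the gaze attention map as channels
    of one input array, so a single shared backbone sees both modalities."""
    return np.stack([image, normalize_fixation_map(fix)], axis=0)  # (2, H, W)

rng = np.random.default_rng(0)
cxr = rng.random((224, 224))               # stand-in for a normalized CXR image
fixations = rng.random((224, 224)) ** 4    # stand-in for a sparse, peaked fixation map
fused = fuse_image_and_gaze(cxr, fixations)
```

Concatenating the gaze map as an extra input channel (rather than training a separate gaze branch) is one common way to let the backbone downweight noise in the fixation data, consistent with the shared-backbone motivation in the abstract.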

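The abstract's interpretability check uses Gradient-weighted Class Activation Mapping (Grad-CAM). The core computation is small enough to sketch: average the class-score gradients over each feature map to get channel weights, form the weighted sum of activations, and keep only the positive evidence. This is a generic NumPy sketch of the standard Grad-CAM formula with synthetic inputs, not the paper's code; in practice the activations and gradients come from a backbone's last convolutional layer.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM: weight each feature map by the spatially averaged gradient of
    the class score, sum over channels, and keep only positive evidence."""
    weights = gradients.mean(axis=(1, 2))                     # (C,) channel importance
    cam = (weights[:, None, None] * activations).sum(axis=0)  # (H, W) weighted sum
    cam = np.maximum(cam, 0.0)                                # ReLU: positive influence only
    if cam.max() > 0:
        cam = cam / cam.max()                                 # normalize to [0, 1]
    return cam

rng = np.random.default_rng(1)
acts = rng.random((8, 7, 7))              # stand-in conv activations (C, H, W)
grads = rng.standard_normal((8, 7, 7))    # stand-in d(score)/d(activations)
heatmap = grad_cam(acts, grads)
```

Upsampling the resulting heatmap to the input resolution and comparing it against expert fixation maps or ROI annotations is the kind of explanation-alignment evaluation the abstract reports.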

Figures (sensors-25-04575, g001–g007):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/78f5b720cd22/sensors-25-04575-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/1826de99c4c0/sensors-25-04575-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/2b1dea418105/sensors-25-04575-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/f6e3fede11f3/sensors-25-04575-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/d2748506e2af/sensors-25-04575-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/9153071f5b02/sensors-25-04575-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af38/12349219/b886397b40c5/sensors-25-04575-g007.jpg

Similar Articles

1. Eye-Guided Multimodal Fusion: Toward an Adaptive Learning Framework Using Explainable Artificial Intelligence. Sensors (Basel). 2025 Jul 24;25(15):4575. doi: 10.3390/s25154575.
2. CXR-MultiTaskNet: a unified deep learning framework for joint disease localization and classification in chest radiographs. Sci Rep. 2025 Aug 31;15(1):32022. doi: 10.1038/s41598-025-16669-z.
3. Shedding light on AI in radiology: A systematic review and taxonomy of eye gaze-driven interpretability in deep learning. Eur J Radiol. 2024 Mar;172:111341. doi: 10.1016/j.ejrad.2024.111341. Epub 2024 Feb 1.
4. A deep learning approach to direct immunofluorescence pattern recognition in autoimmune bullous diseases. Br J Dermatol. 2024 Jul 16;191(2):261-266. doi: 10.1093/bjd/ljae142.
5. Are Artificial Intelligence Models Listening Like Cardiologists? Bridging the Gap Between Artificial Intelligence and Clinical Reasoning in Heart-Sound Classification Using Explainable Artificial Intelligence. Bioengineering (Basel). 2025 May 22;12(6):558. doi: 10.3390/bioengineering12060558.
6. Deep Learning and Image Generator Health Tabular Data (IGHT) for Predicting Overall Survival in Patients With Colorectal Cancer: Retrospective Study. JMIR Med Inform. 2025 Aug 19;13:e75022. doi: 10.2196/75022.
7. ItpCtrl-AI: End-to-end interpretable and controllable artificial intelligence by modeling radiologists' intentions. Artif Intell Med. 2025 Feb;160:103054. doi: 10.1016/j.artmed.2024.103054. Epub 2024 Dec 12.
8. Novel Artificial Intelligence-Driven Infant Meningitis Screening From High-Resolution Ultrasound Imaging. Ultrasound Med Biol. 2025 Jun 28. doi: 10.1016/j.ultrasmedbio.2025.04.009.
9. Explainable Artificial Intelligence (XAI) in the Era of Large Language Models: Applying an XAI Framework in Pediatric Ophthalmology Diagnosis using the Gemini Model. AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:566-575. eCollection 2025.
10. Deep learning-based image classification for AI-assisted integration of pathology and radiology in medical imaging. Front Med (Lausanne). 2025 Jun 2;12:1574514. doi: 10.3389/fmed.2025.1574514. eCollection 2025.

References Cited in This Article

1. Eye Gaze Guided Cross-Modal Alignment Network for Radiology Report Generation. IEEE J Biomed Health Inform. 2024 Dec;28(12):7406-7419. doi: 10.1109/JBHI.2024.3422168. Epub 2024 Dec 5.
2. Analyzing Eye Paths Using Fractals. Adv Neurobiol. 2024;36:827-848. doi: 10.1007/978-3-031-47606-8_42.
3. Eye-Gaze-Guided Vision Transformer for Rectifying Shortcut Learning. IEEE Trans Med Imaging. 2023 Nov;42(11):3384-3394. doi: 10.1109/TMI.2023.3287572. Epub 2023 Oct 27.
4. Explainable AI in medical imaging: An overview for clinical practitioners - Beyond saliency-based XAI approaches. Eur J Radiol. 2023 May;162:110786. doi: 10.1016/j.ejrad.2023.110786. Epub 2023 Mar 20.
5. Skill Characterisation of Sonographer Gaze Patterns during Second Trimester Clinical Fetal Ultrasounds using Time Curves. Proc Eye Track Res Appl Symp. 2022 Jun;2022. doi: 10.1145/3517031.3529637. Epub 2022 Jun 8.
6. Changes in Radiologists' Gaze Patterns Against Lung X-rays with Different Abnormalities: a Randomized Experiment. J Digit Imaging. 2023 Jun;36(3):767-775. doi: 10.1007/s10278-022-00760-2. Epub 2023 Jan 9.
7. REFLACX, a dataset of reports and eye-tracking data for localization of abnormalities in chest x-rays. Sci Data. 2022 Jun 18;9(1):350. doi: 10.1038/s41597-022-01441-z.
8. Explainable artificial intelligence (XAI) in deep learning-based medical image analysis. Med Image Anal. 2022 Jul;79:102470. doi: 10.1016/j.media.2022.102470. Epub 2022 May 4.
9. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022 Jul;79:102444. doi: 10.1016/j.media.2022.102444. Epub 2022 Apr 4.
10. On Smart Gaze Based Annotation of Histopathology Images for Training of Deep Convolutional Neural Networks. IEEE J Biomed Health Inform. 2022 Jul;26(7):3025-3036. doi: 10.1109/JBHI.2022.3148944. Epub 2022 Jul 1.