Abdullakutty Faseela, Akbari Younes, Al-Maadeed Somaya, Bouridane Ahmed, Talaat Iman M, Hamoudi Rifat
Department of Computer Science and Engineering, Qatar University, Doha, Qatar.
Computer Engineering Department, College of Computing and Informatics, University of Sharjah, Sharjah, United Arab Emirates.
Front Med (Lausanne). 2024 Sep 30;11:1450103. doi: 10.3389/fmed.2024.1450103. eCollection 2024.
Precision and timeliness in breast cancer detection are paramount for improving patient outcomes. Traditional diagnostic methods have predominantly relied on unimodal approaches, but recent advances in medical data analytics have enabled the integration of diverse data sources beyond conventional imaging techniques. This review critically examines the transformative potential of integrating histopathology images with genomic data, clinical records, and patient histories to enhance the diagnostic accuracy and comprehensiveness of multimodal diagnostic techniques. It explores early, intermediate, and late fusion methods, as well as advanced deep multimodal fusion techniques, including encoder-decoder architectures, attention-based mechanisms, and graph neural networks. An overview of recent advances in multimodal tasks such as Visual Question Answering (VQA), report generation, semantic segmentation, and cross-modal retrieval is provided, highlighting the use of generative AI and visual language models. Additionally, the review delves into the role of Explainable Artificial Intelligence (XAI) in elucidating the decision-making processes of sophisticated diagnostic algorithms, emphasizing the critical need for transparency and interpretability. By showcasing the importance of explainability, we demonstrate how XAI methods, including Grad-CAM, SHAP, LIME, trainable attention, and image captioning, enhance diagnostic precision, strengthen clinician confidence, and foster patient engagement. The review also discusses the latest XAI developments, such as X-VARs, LeGrad, LangXAI, LVLM-Interpret, and ex-ILP, to demonstrate their potential utility in multimodal breast cancer detection, while identifying key research gaps and proposing future directions for advancing the field.