Lokaj Belinda, Durand de Gevigney Valentin, Djema Dahila-Amal, Zaghir Jamil, Goldman Jean-Philippe, Bjelogrlic Mina, Turbé Hugues, Kinkel Karen, Lovis Christian, Schmid Jérôme
Geneva School of Health Sciences, HES-SO University of Applied Sciences and Arts Western Switzerland, Delémont, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland.
Geneva School of Health Sciences, HES-SO University of Applied Sciences and Arts Western Switzerland, Delémont, Switzerland.
Comput Biol Med. 2025 Apr;188:109721. doi: 10.1016/j.compbiomed.2025.109721. Epub 2025 Feb 19.
Breast cancer is the most common cancer worldwide, and magnetic resonance imaging (MRI) is a highly sensitive technique for invasive cancer detection. When reviewing breast MRI examinations, clinical radiologists rely on multimodal information, comprising imaging data as well as information not present in the images, such as clinical information. Most machine learning (ML) approaches are not well suited to multimodal data. However, attention-based architectures such as Transformers are flexible and are therefore good candidates for integrating multimodal data.
The aim of this study was to develop and evaluate a novel multimodal deep learning (DL) model combining ultrafast dynamic contrast-enhanced (UF-DCE) MRI images, lesion characteristics and clinical information for breast lesion classification.
From 2019 to 2023, UF-DCE breast images and radiology reports of 240 patients were retrospectively collected from a single clinical center and annotated. Imaging data consisted of volumes of interest (VOIs) extracted around segmented lesions. Non-imaging data consisted of both clinical (categorical) and geometrical (scalar) data. Clinical data were extracted from the annotated reports and associated with their corresponding lesions. We compared the diagnostic performance of traditional ML methods on non-imaging data, an image-only DL model, and a novel Transformer-based architecture, the Multimodal Sieve Transformer with Vision Transformer encoder (MMST-V).
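The multimodal fusion idea can be illustrated with a minimal, hypothetical sketch (this is not the authors' MMST-V implementation; all dimensions and token counts below are invented for illustration): image-patch embeddings and embedded non-imaging features are concatenated into a single token sequence, and scaled dot-product self-attention lets every modality attend to every other.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, rng):
    """Single-head scaled dot-product self-attention over a token sequence.

    tokens: array of shape (n_tokens, d), one row per image patch or
    non-imaging feature embedding. Returns an array of the same shape.
    """
    d = tokens.shape[1]
    # Random projection weights stand in for learned parameters.
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(d))  # every token attends to all tokens
    return attn @ v

rng = np.random.default_rng(0)
d = 16
image_tokens = rng.standard_normal((8, d))     # e.g. 8 VOI patch embeddings
clinical_tokens = rng.standard_normal((3, d))  # e.g. embedded categorical data
scalar_tokens = rng.standard_normal((2, d))    # e.g. projected geometric scalars

# Concatenate modalities into one sequence so attention can mix them.
seq = np.concatenate([image_tokens, clinical_tokens, scalar_tokens])
fused = self_attention(seq, rng)
print(fused.shape)  # → (13, 16)
```

Because attention weights are computed across the full concatenated sequence, image tokens can draw on clinical and geometric context (and vice versa), which is the flexibility the abstract attributes to attention-based architectures.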
The final dataset included 987 lesions (280 benign lesions, 121 malignant lesions, and 586 benign lymph nodes) and 1081 reports. For classification with non-imaging data, scalar data had a greater influence on lesion classification performance (area under the receiver operating characteristic curve (AUROC) = 0.875 ± 0.042) than categorical data (AUROC = 0.680 ± 0.060). MMST-V achieved better performance (AUROC = 0.928 ± 0.027) than classification based on non-imaging data only (AUROC = 0.900 ± 0.045) or imaging data only (AUROC = 0.863 ± 0.025).
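AUROC, the metric reported above, equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one (the Mann-Whitney U interpretation). A minimal sketch, using illustrative data rather than the study's results:

```python
import numpy as np

def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the fraction of
    positive/negative pairs where the positive scores higher,
    counting ties as 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Illustrative example: scores for 3 malignant (1) and 3 benign (0) lesions.
labels = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.4, 0.7, 0.3, 0.2])
print(auroc(labels, scores))  # → 0.888... (8 of 9 pairs correctly ordered)
```

The ± values in the abstract would then be the mean and standard deviation of this quantity across repeated evaluation splits.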
The proposed MMST-V is an adaptive approach that can account for redundant information across modalities. It demonstrated better performance than unimodal methods. The results highlight that combining clinical patient data and detailed lesion information as additional clinical knowledge enhances the diagnostic performance of UF-DCE breast MRI.