Sarker Sushmita, Sarker Prithul, Bebis George, Tavakkoli Alireza
Department of Computer Science and Engineering, University of Nevada, Reno, USA.
Proc IEEE Int Symp Biomed Imaging. 2024 May;2024. doi: 10.1109/isbi56570.2024.10635578. Epub 2024 Aug 22.
Traditional deep learning approaches for breast cancer classification have predominantly concentrated on single-view analysis. In clinical practice, however, radiologists concurrently examine all views within a mammography exam, leveraging the inherent correlations among these views to effectively detect tumors. Acknowledging the significance of multi-view analysis, some studies have introduced methods that process mammogram views independently, either through distinct convolutional branches or simple fusion strategies, inadvertently losing crucial inter-view correlations. In this paper, we propose an innovative multi-view network based exclusively on transformers to address challenges in mammographic image classification. Our approach introduces a novel shifted window-based dynamic attention block, facilitating the effective integration of multi-view information and promoting the coherent transfer of this information between views at the spatial feature map level. Furthermore, we conduct a comprehensive comparative analysis of the performance and effectiveness of transformer-based models under diverse settings, employing the CBIS-DDSM and VinDr-Mammo datasets. Our code is publicly available at https://github.com/prithuls/MV-Swin-T.
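To make the idea of window-based cross-view attention concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes two same-sized feature maps (standing in for, e.g., CC and MLO views), partitions each into non-overlapping local windows as in Swin-style attention, and lets queries from one view attend over keys/values pooled from both views inside each window; the projection matrices, shift step, and relative position bias of a real Swin block are omitted for brevity.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into (num_windows, ws*ws, C) tokens."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def multi_view_window_attention(x1, x2, ws=2):
    """Toy cross-view attention: queries from view 1 attend over
    keys/values gathered from BOTH views within each local window."""
    w1 = window_partition(x1, ws)             # (nW, ws*ws, C)
    w2 = window_partition(x2, ws)
    q = w1                                    # identity projections for brevity
    kv = np.concatenate([w1, w2], axis=1)     # (nW, 2*ws*ws, C): both views
    scale = q.shape[-1] ** -0.5
    attn = softmax(q @ kv.transpose(0, 2, 1) * scale)  # (nW, ws*ws, 2*ws*ws)
    return attn @ kv                          # attended view-1 tokens

# Two 4x4, 8-channel "feature maps" standing in for the two mammogram views.
rng = np.random.default_rng(0)
cc = rng.normal(size=(4, 4, 8))
mlo = rng.normal(size=(4, 4, 8))
out = multi_view_window_attention(cc, mlo, ws=2)
print(out.shape)  # (4, 4, 8): 4 windows of 2x2 = 4 tokens, 8 channels
```

Because each output token is a convex combination of tokens from both views within its window, inter-view correlations are mixed at the spatial feature map level rather than only after global pooling, which is the contrast the abstract draws with simple late-fusion strategies.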