Shareef Bryar, Xian Min, Vakanski Aleksandar, Wang Haotian
Department of Computer Science, University of Idaho, Idaho Falls, Idaho 83402, USA.
Med Image Comput Comput Assist Interv. 2023 Oct;14223:344-353. doi: 10.1007/978-3-031-43901-8_33. Epub 2023 Oct 1.
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification. Although convolutional neural networks (CNNs) have demonstrated reliable performance in tumor classification, they have inherent limitations for modeling global and long-range dependencies due to the localized nature of convolution operations. Vision Transformers capture global contextual information more effectively, but their tokenization operations may distort local image patterns. In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation using a hybrid architecture composed of CNN and Swin Transformer components. The proposed approach was compared to nine BUS classification methods and evaluated using seven quantitative metrics on a dataset of 3,320 BUS images. The results indicate that Hybrid-MT-ESTAN achieved the highest accuracy, sensitivity, and F1 score of 82.7%, 86.4%, and 86.0%, respectively.
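The reported accuracy, sensitivity, and F1 score follow the standard confusion-matrix definitions. A minimal sketch of how such metrics are computed, using hypothetical counts (not the paper's data):

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity (recall), and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)            # recall: true positives among all actual positives
    precision = tp / (tp + fp)              # true positives among all predicted positives
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, f1

# Hypothetical counts for illustration only:
acc, sens, f1 = classification_metrics(tp=80, fp=15, tn=90, fn=15)
```

In practice such metrics are typically computed per class and averaged over the test set; the specific averaging scheme affects the reported F1.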