IEEE Trans Med Imaging. 2024 Feb;43(2):674-685. doi: 10.1109/TMI.2023.3317088. Epub 2024 Feb 2.
Medical image segmentation and classification are two key steps in computer-aided clinical diagnosis. Conventionally, the region of interest is segmented first so that useful features can be extracted for subsequent disease classification. However, such two-stage methods are computationally complex and time-consuming. In this paper, we propose a one-stage multi-task attention network (MTANet) that efficiently classifies objects in an image while generating a high-quality segmentation mask for each medical object. A reverse addition attention module is designed for the segmentation task to fuse areas in the global map with boundary cues in high-resolution features, and an attention bottleneck module is used in the classification task to fuse image features with clinical features. We evaluated MTANet with CNN-based and transformer-based architectures across three imaging modalities and different tasks: the CVC-ClinicDB dataset for polyp segmentation, the ISIC-2018 dataset for skin lesion segmentation, and our private ultrasound dataset for liver tumor segmentation and classification. The proposed model outperformed state-of-the-art models on all three datasets and was superior to all 25 radiologists in liver tumor diagnosis.