Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610213, Sichuan, China.
School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101499, China.
BMC Med Imaging. 2024 Sep 16;24(1):241. doi: 10.1186/s12880-024-01401-6.
The recently introduced SAM-Med2D represents a state-of-the-art advance in medical image segmentation. By fine-tuning the large vision model Segment Anything Model (SAM) on extensive medical datasets, it achieves impressive results in cross-modal medical image segmentation. However, its reliance on interactive prompts may restrict its applicability under certain conditions. To address this limitation, we introduce SAM-AutoMed, which performs automatic segmentation of medical images by replacing the original prompt encoder with an improved MobileNet v3 backbone; its performance on multiple datasets surpasses both SAM and SAM-Med2D. Existing enhancements of SAM have not yet been applied to medical image classification. We therefore introduce SAM-MedCls, which combines the encoder of SAM-Med2D with our designed attention modules to build an end-to-end medical image classification model. It performs well on datasets of various modalities, even achieving state-of-the-art results, indicating its potential to become a universal model for medical image classification.
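The core architectural change described in the abstract can be sketched as follows. This is a toy structural sketch, not the authors' implementation: all class and attribute names (`PromptEncoder`, `MobileNetV3PromptBackbone`, `SegmentAnything`) are hypothetical, and the "features" and "masks" are stand-ins for real learned tensors. It only illustrates the design idea: SAM and SAM-Med2D embed user-supplied prompts (clicks/boxes), whereas SAM-AutoMed derives prompt embeddings directly from the image via a MobileNet v3-style backbone, so no interaction is needed.

```python
class PromptEncoder:
    """Stand-in for SAM's original prompt encoder: embeds user clicks."""
    def __call__(self, points):
        # toy embedding: one 2-vector per user-provided prompt point
        return [[float(x), float(y)] for x, y in points]


class MobileNetV3PromptBackbone:
    """Hypothetical replacement: derives a prompt embedding from the image."""
    def __call__(self, image):
        # toy "feature": mean intensity stands in for learned features
        mean = sum(sum(row) for row in image) / (len(image) * len(image[0]))
        return [[mean, mean]]  # a single automatic "prompt" embedding


class SegmentAnything:
    """Minimal pipeline: image + prompt embedding -> binary mask."""
    def __init__(self, prompt_encoder):
        self.prompt_encoder = prompt_encoder

    def segment(self, image, prompts=None):
        if prompts is not None:            # interactive path (SAM / SAM-Med2D)
            emb = self.prompt_encoder(prompts)
        else:                              # automatic path (SAM-AutoMed style)
            emb = self.prompt_encoder(image)
        # decoder stub: threshold pixels against the first embedding value
        return [[1 if px > emb[0][0] else 0 for px in row] for row in image]


# Usage: the automatic variant needs only the image, no clicks or boxes.
auto_model = SegmentAnything(MobileNetV3PromptBackbone())
mask = auto_model.segment([[0.1, 0.9], [0.8, 0.2]])  # → [[0, 1], [1, 0]]
```

The point of the sketch is the interface change alone: the mask decoder is untouched, and only the source of the prompt embedding differs between the interactive and automatic paths.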