Hejrati Behzad, Banerjee Soumyanil, Glide-Hurst Carri, Dong Ming
Department of Computer Science, Wayne State University, Detroit, MI, USA.
Department of Human Oncology, University of Wisconsin-Madison, Madison, WI, USA.
Med Image Comput Comput Assist Interv. 2024 Oct;15009:202-212. doi: 10.1007/978-3-031-72114-4_20. Epub 2024 Oct 3.
Diffusion models have been used extensively for high-quality image and video generation. In this paper, we propose cDAL, a novel conditional diffusion model with spatial attention and latent embedding, for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish generated labels from real ones. A spatial attention map, computed from the features learned by the discriminator, helps cDAL segment the discriminative regions of an input image more accurately. Additionally, we incorporate a random latent embedding into each layer of the model, which significantly reduces the number of training and sampling time-steps and makes cDAL much faster than other diffusion models for image segmentation. We applied cDAL to three publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray and Hippocampus) and observed significant qualitative and quantitative improvements over state-of-the-art algorithms, with higher Dice scores and mIoU. The source code is publicly available at https://github.com/Hejrati/cDAL/.
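For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a Dice score and a mean intersection-over-union (mIoU) are commonly computed on flattened label maps. It is not taken from the cDAL repository: the function names, the flat-list input format, the choice to score an empty-versus-empty comparison as 1.0, and the unweighted averaging over classes are all assumptions made for illustration. The abstract does not state how the paper averages its mIoU.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """IoU = |P ∩ T| / |P ∪ T| for binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Unweighted mean of per-class IoU over integer label maps (hypothetical convention)."""
    per_class = []
    for c in range(num_classes):
        pred_c = [1 if p == c else 0 for p in pred]
        target_c = [1 if t == c else 0 for t in target]
        per_class.append(iou(pred_c, target_c))
    return sum(per_class) / num_classes


# Toy 4-pixel example: one true foreground pixel, two predicted.
pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
print(dice_score(pred, target))   # 2*1 / (2+1)
print(mean_iou(pred, target, 2))  # mean of background and foreground IoU
```

In this toy case the foreground IoU is 1/2 and the background IoU is 2/3, so Dice and IoU rank predictions similarly but are not numerically interchangeable.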