M2Trans: Multi-Modal Regularized Coarse-to-Fine Transformer for Ultrasound Image Super-Resolution.

Author Information

Ni Zhangkai, Xiao Runyu, Yang Wenhan, Wang Hanli, Wang Zhihua, Xiang Lihua, Sun Liping

Publication Information

IEEE J Biomed Health Inform. 2025 May;29(5):3112-3123. doi: 10.1109/JBHI.2024.3454068. Epub 2025 May 6.

Abstract

Ultrasound image super-resolution (SR) aims to transform low-resolution images into high-resolution ones, thereby restoring intricate details crucial for improved diagnostic accuracy. However, prevailing methods relying solely on image-modality guidance and pixel-wise loss functions struggle to capture the distinct characteristics of medical images, such as unique texture patterns and specific colors harboring critical diagnostic information. To overcome these challenges, this paper introduces the Multi-Modal Regularized Coarse-to-Fine Transformer (M2Trans) for ultrasound image SR. By integrating the text modality, we establish joint image-text guidance during training, leveraging a medical CLIP model to incorporate richer priors from text descriptions into the SR optimization process and thereby enhance the recovery of detail, structure, and semantics. Furthermore, we propose a novel coarse-to-fine transformer comprising multiple branches infused with self-attention and frequency transforms to efficiently capture signal dependencies across different scales. Extensive experiments demonstrate significant improvements over state-of-the-art methods on benchmark datasets, including CCA-US, US-CASE, and our newly created dataset MMUS1K, with minimum PSNR gains of 0.17 dB, 0.30 dB, and 0.28 dB, respectively.
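The two ideas named in the abstract can be pictured with a couple of minimal sketches. First, the joint image-text guidance: alongside the usual pixel-wise loss, the super-resolved output is encoded with a CLIP-style image encoder and regularized toward the embedding of its paired clinical text. The function name, the `clip_image_encoder` callable, and the weight `lambda_text` below are illustrative placeholders under assumed conventions, not the paper's actual implementation.

```python
import torch.nn.functional as F

def multimodal_sr_loss(sr, hr, text_emb, clip_image_encoder, lambda_text=0.1):
    """Pixel reconstruction loss plus a CLIP-style text regularizer (illustrative only)."""
    # Standard pixel-wise reconstruction term; L1 is a common choice in SR training.
    pixel_loss = F.l1_loss(sr, hr)

    # Embed the super-resolved image with a (medical) CLIP image encoder and pull it
    # toward the precomputed embedding of the paired text description.
    img_emb = F.normalize(clip_image_encoder(sr), dim=-1)
    txt_emb = F.normalize(text_emb, dim=-1)
    text_loss = 1.0 - (img_emb * txt_emb).sum(dim=-1).mean()  # mean cosine distance

    return pixel_loss + lambda_text * text_loss
```

Second, under the same caveat, a frequency-transform branch can be sketched as a module that mixes features in the Fourier domain, so every output position depends on the full spatial extent of the input; the actual branch design in M2Trans may differ.

```python
import torch
import torch.nn as nn

class FrequencyBranch(nn.Module):
    """Mixes features in the Fourier domain to capture global (long-range) dependencies."""
    def __init__(self, channels):
        super().__init__()
        # The complex spectrum is handled as stacked real/imaginary channels.
        self.mix = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")           # complex tensor (b, c, h, w//2 + 1)
        feat = torch.cat([spec.real, spec.imag], dim=1)   # to a real-valued tensor
        feat = self.mix(feat)
        real, imag = feat.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
```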
