Multi-scale dual-channel feature embedding decoder for biomedical image segmentation.

Author affiliations

Department of Computer Science and Engineering, National Institute of Technology, Durgapur 713209, West Bengal, India.

Department of Information Technology, Sikkim Manipal Institute of Technology, Sikkim Manipal University, India.

Publication information

Comput Methods Programs Biomed. 2024 Dec;257:108464. doi: 10.1016/j.cmpb.2024.108464. Epub 2024 Oct 18.

Abstract

BACKGROUND AND OBJECTIVE

Capturing global context together with local dependencies is essential for accurately segmenting objects from image frames, and it remains a challenge when developing deep learning-based biomedical image segmentation. Several transformer-based models have been proposed to address this issue. Even so, segmentation accuracy remains an open problem: these models often fall short of the target range because of their limited capacity to capture critical local and global contexts. Their main limitation is quadratic computational complexity. In addition, they require a large dataset for training.
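To make the quadratic-complexity point concrete, the sketch below compares the commonly cited cost estimates for global multi-head self-attention and for window-based self-attention as used in Swin Transformers. The formulas are the standard estimates from the Swin Transformer literature, not figures from this abstract; the function names and the example sizes (channel width 96, window size 7) are illustrative assumptions.

```python
def global_attention_flops(h, w, c):
    """Estimated cost of global self-attention on an h x w map with c channels.

    The second term grows with (h * w) ** 2, i.e. quadratically in the
    number of tokens.
    """
    return 4 * h * w * c**2 + 2 * (h * w) ** 2 * c


def window_attention_flops(h, w, c, m):
    """Estimated cost when attention is restricted to m x m windows.

    The second term grows only linearly in the number of tokens h * w.
    """
    return 4 * h * w * c**2 + 2 * m**2 * h * w * c


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the paper.
    for side in (56, 112, 224):
        g = global_attention_flops(side, side, 96)
        win = window_attention_flops(side, side, 96, 7)
        print(f"{side}x{side}: global={g:.3e}  windowed={win:.3e}  ratio={g / win:.1f}")
```

Doubling the side length of the feature map multiplies the token count by four, so the attention term of the global estimate grows by a factor of sixteen while the windowed one grows by a factor of four.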

METHODS

In this paper, we propose a novel multi-scale dual-channel decoder to mitigate these issues. The complete segmentation model uses two parallel encoders and a dual-channel decoder. The encoders are based on convolutional networks, which capture the features of the input images at multiple levels and scales. The decoder comprises a hierarchy of Attention-gated Swin Transformers with a fine-tuning strategy. The hierarchical Attention-gated Swin Transformers implement a multi-scale, multi-level feature embedding strategy that captures short- and long-range dependencies and leverages the necessary features without increasing the computational load. At the final stage of the decoder, a fine-tuning strategy refines the features so that rich features are retained and the possibility of over-segmentation is reduced.
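The abstract does not specify how the attention gate is formulated. As a rough illustration of the general idea only, the sketch below implements an additive attention gate in the style of Attention U-Net: a gating signal from a coarser decoder stage produces a per-position weight between 0 and 1 that scales the encoder (skip) features. The names `attention_gate`, `W_x`, `W_g`, and `psi` are hypothetical, and this is not claimed to be the authors' design.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def attention_gate(skip, gate, W_x, W_g, psi):
    """Additive attention gate (assumed form, not the paper's).

    skip, gate: lists of per-position feature vectors, one entry per
    spatial position, already brought to the same resolution.
    W_x, W_g:   projection matrices into a shared hidden space.
    psi:        vector mapping the hidden features to one scalar score.
    """
    gated = []
    for x, g in zip(skip, gate):
        # Project both inputs, add them, and apply ReLU.
        hidden = [max(0.0, a + b) for a, b in zip(matvec(W_x, x), matvec(W_g, g))]
        # Collapse to a scalar score and squash it into (0, 1).
        score = sum(p * h for p, h in zip(psi, hidden))
        alpha = 1.0 / (1.0 + math.exp(-score))
        # Scale the skip features by the attention coefficient.
        gated.append([alpha * x_c for x_c in x])
    return gated


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    skip = [[1.0, 2.0], [3.0, 4.0]]
    gate = [[5.0, 5.0], [-9.0, -9.0]]
    print(attention_gate(skip, gate, identity, identity, [1.0, 1.0]))
```

In this toy run, the first position receives a strong gating signal and passes almost unchanged, while the second position is suppressed to half its value because the ReLU zeroes its hidden features.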

RESULTS

The proposed model is evaluated on the publicly available LiTS and 3DIRCADb datasets and on the spleen dataset from the Medical Segmentation Decathlon. The model is also evaluated on a private dataset from Medical College Kolkata, India. We observe that the proposed model outperforms the state-of-the-art models in liver tumor and spleen segmentation in terms of the evaluation metrics, at a comparable computational cost.
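The abstract does not name the evaluation metrics. The Dice similarity coefficient is a widely used overlap measure for segmentation, so the sketch below shows how it could be computed for binary masks, purely as an assumed example. The function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are illustrative choices.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flat binary masks of equal length.

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    Two empty masks are treated as a perfect match (an assumed convention).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(f"Dice = {dice_coefficient(predicted, reference):.3f}")
```

For volumetric data such as CT scans, the same function applies once each mask is flattened to a one-dimensional sequence of voxels.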

CONCLUSION

The novel dual-channel decoder embeds multi-scale features and efficiently builds a representation of both short- and long-range contexts. It also refines the features at the final stage to select only the necessary ones. As a result, we achieve better segmentation performance than the state-of-the-art models.
