InCoLoTransNet: An Involution-Convolution and Locality Attention-Aware Transformer for Precise Colorectal Polyp Segmentation in GI Images.

Authors

Oukdach Yassine, Garbaz Anass, Kerkaou Zakaria, Ansari Mohamed El, Koutti Lahcen, Ouafdi Ahmed Fouad El, Salihoun Mouna

Affiliations

LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco.

Informatics and Applications Laboratory, Department of Computer Sciences, Faculty of Science, Moulay Ismail University, B.P 11201, Meknès, 52000, Morocco.

Publication information

J Imaging Inform Med. 2025 Jan 17. doi: 10.1007/s10278-025-01389-7.

Abstract

Examining gastrointestinal (GI) disease is challenging for doctors because of the intricate structure of the human digestive system. Colonoscopy and wireless capsule endoscopy are the most commonly used tools for GI examination. However, the large volume of data these technologies generate must be reviewed by expert doctors to identify disease, which makes manual analysis very time-consuming. A computer-assisted system that helps clinical professionals make decisions in a low-cost and effective way is therefore highly desirable. In this paper, we introduce InCoLoTransNet, a novel framework for polyp segmentation. It combines a transformer with a convolution-involution neural network in an encoder-decoder architecture. The encoder uses a vision transformer to capture global context, while the decoder uses convolution and involution together to resample the polyp features. Involution enhances the model's ability to adaptively capture spatial and contextual information, while convolution focuses on local information, leading to more accurate feature extraction. The essential features captured by the transformer encoder are passed to the decoder through two skip connection pathways. In the first, a CBAM (convolutional block attention module) refines the features and passes them to the convolution block, using attention mechanisms to emphasize relevant information. In the second, locality self-attention passes essential features to the involution block, reinforcing the model's ability to capture more global features in the polyp regions. Experiments were conducted on five public datasets: CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG, Etis-LaribPolypDB, and CVC-300.
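The involution operation mentioned above can be illustrated with a minimal NumPy sketch. It shows the generic involution idea (a small kernel is generated at every pixel from that pixel's own features and shared across the channels of a group), not the authors' exact decoder block. The function names, the weight names `w_reduce` and `w_span`, and the plain ReLU bottleneck without normalization are all assumptions made for illustration.

```python
import numpy as np


def generate_kernels(x, w_reduce, w_span):
    """Generate one K*K kernel per pixel and per group from that pixel's features.

    x:        (H, W, C) feature map
    w_reduce: (C, C_mid) channel-reduction weights (hypothetical name)
    w_span:   (C_mid, K*K*G) weights expanding to kernel values (hypothetical name)
    returns:  (H, W, K*K*G) kernel values
    """
    hidden = np.maximum(x @ w_reduce, 0.0)  # ReLU bottleneck (assumed, no normalization)
    return hidden @ w_span


def apply_involution(x, kernels, kernel_size, groups):
    """Apply per-pixel kernels; channels in the same group share a kernel.

    Assumes an odd kernel_size, stride 1, and zero padding.
    """
    h, w, c = x.shape
    k = kernel_size
    pad = k // 2
    assert c % groups == 0, "channels must be divisible by groups"
    kernels = kernels.reshape(h, w, k * k, groups)
    # Channel ch uses the kernel of group ch // (C / G).
    per_channel = np.repeat(kernels, c // groups, axis=3)  # (H, W, K*K, C)
    x_pad = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out = np.zeros_like(x, dtype=float)
    for i in range(k):
        for j in range(k):
            # Weight the neighbour at offset (i - pad, j - pad) by its kernel entry.
            out += per_channel[:, :, i * k + j, :] * x_pad[i:i + h, j:j + w, :]
    return out


# Usage on random data: 4 channels, 2 groups, 3x3 kernels.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 6, 4))
w_reduce = rng.normal(size=(4, 2))
w_span = rng.normal(size=(2, 3 * 3 * 2))
kernels = generate_kernels(x, w_reduce, w_span)
y = apply_involution(x, kernels, kernel_size=3, groups=2)
```

Unlike convolution, where one kernel is shared across all positions and differs per channel, the kernel here differs per position and is shared across channels, which is what lets it adapt to spatial context.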
Compared with 15 state-of-the-art polyp segmentation methods, InCoLoTransNet achieves the best results, reaching the highest mean Dice score of 93% and a mean intersection over union (IoU) of 90% on CVC-ColonDB. InCoLoTransNet also stands out in generalization performance. On unseen datasets, it achieved mean Dice and mean IoU scores of 85% and 79% on CVC-ColonDB, 91% and 87% on CVC-300, and 79% and 70% on Etis-LaribPolypDB, respectively.
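For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how the Dice coefficient and intersection over union are commonly computed for binary segmentation masks. The function names, the smoothing term `eps`, and the choice to average per image are assumptions for illustration; the abstract does not state the authors' exact evaluation protocol.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks of the same shape.

    eps is a small smoothing term (an assumption) so that two empty
    masks score 1.0 instead of dividing by zero.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


def mean_dice_and_iou(pairs):
    """Average Dice and IoU over (prediction, ground truth) pairs, one pair per image."""
    scores = np.array([dice_and_iou(p, t) for p, t in pairs])
    return tuple(scores.mean(axis=0))


# Toy example: the prediction covers two pixels, the ground truth one of them.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
dice, iou = dice_and_iou(pred, target)  # Dice is about 0.667, IoU is about 0.5
```

For a single mask pair the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU, which is consistent with the paired scores reported above.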
