Vision transformer distillation for enhanced gastrointestinal abnormality recognition in wireless capsule endoscopy images.

Authors

Oukdach Yassine, Garbaz Anass, Kerkaou Zakaria, El Ansari Mohamed, Koutti Lahcen, Papachrysos Nikolaos, El Ouafdi Ahmed Fouad, de Lange Thomas, Distante Cosimo

Affiliations

Ibn Zohr University, LabSIV, Department of Computer Science, Faculty of Sciences, Agadir, Morocco.

Moulay Ismail University, Informatics and Applications Laboratory, Department of Computer Sciences, Faculty of Science, Meknes, Morocco.

Publication information

J Med Imaging (Bellingham). 2025 Jan;12(1):014505. doi: 10.1117/1.JMI.12.1.014505. Epub 2025 Feb 5.

Abstract

PURPOSE

Wireless capsule endoscopy (WCE) is a non-invasive technology used for diagnosing gastrointestinal abnormalities. A single examination generates a large number of images, making manual review both time-consuming and costly for doctors. Computer vision-assisted systems that aid the diagnostic process are therefore highly desirable.

APPROACH

We present a deep learning approach that uses knowledge distillation (KD) from a convolutional neural network (CNN) teacher model to a vision transformer (ViT) student model for gastrointestinal abnormality recognition. The CNN teacher model uses attention mechanisms and depth-wise separable convolutions to extract features from WCE images, and it supervises the ViT in learning these representations.
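As an illustration of the teacher-to-student supervision described above, the sketch below computes a classic soft-target distillation loss for one sample. The abstract does not specify the loss the authors used, so the formulation here (hard-label cross-entropy blended with a temperature-scaled KL divergence) and the names `distillation_loss`, `temperature`, and `alpha` are assumptions chosen for illustration, not the paper's method.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical KD loss for a single sample (assumed formulation).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between softened teacher and student distributions.
    """
    # Cross-entropy of the student against the ground-truth class.
    hard_probs = softmax(student_logits)
    cross_entropy = -math.log(hard_probs[label])

    # KL(teacher || student) on temperature-softened distributions.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(soft_teacher, soft_student) if t > 0)

    # The temperature**2 factor keeps the soft-term gradient scale
    # comparable to the hard-label term as the temperature changes.
    return alpha * cross_entropy + (1 - alpha) * temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [3.0, 0.5]   # e.g. logits for (abnormal, normal) from the CNN
    student = [1.0, 0.8]   # logits from the ViT for the same image
    print(distillation_loss(student, teacher, label=0))
```

In this sketch the loss shrinks as the student's softened predictions approach the teacher's, and the soft term vanishes when the two sets of logits are identical.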

RESULTS

The proposed method achieves accuracies of 97% and 96% on the Kvasir and KID datasets, respectively, demonstrating its effectiveness in distinguishing normal from abnormal regions and bleeding from non-bleeding cases. The approach is computationally efficient, generalizes to unseen datasets, and outperforms several state-of-the-art methods.

CONCLUSIONS

We proposed a deep learning approach that combines CNNs and a ViT through KD to classify gastrointestinal diseases in WCE images. It demonstrates promising performance on public datasets, distinguishing normal from abnormal regions and bleeding from non-bleeding cases while offering optimal computational efficiency compared with existing methods, making it suitable for gastrointestinal disease applications.
