Vision transformer distillation for enhanced gastrointestinal abnormality recognition in wireless capsule endoscopy images.

Authors

Oukdach Yassine, Garbaz Anass, Kerkaou Zakaria, El Ansari Mohamed, Koutti Lahcen, Papachrysos Nikolaos, El Ouafdi Ahmed Fouad, de Lange Thomas, Distante Cosimo

Affiliations

Ibn Zohr University, LabSIV, Department of Computer Science, Faculty of Sciences, Agadir, Morocco.

Moulay Ismail University, Informatics and Applications Laboratory, Department of Computer Sciences, Faculty of Science, Meknes, Morocco.

Publication information

J Med Imaging (Bellingham). 2025 Jan;12(1):014505. doi: 10.1117/1.JMI.12.1.014505. Epub 2025 Feb 5.

DOI:10.1117/1.JMI.12.1.014505

PMID:39916992

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11796471/

Abstract

PURPOSE

Wireless capsule endoscopy (WCE) is a non-invasive technology used for diagnosing gastrointestinal abnormalities. A single examination generates a large number of images, making manual review both time-consuming and costly for doctors. Therefore, the development of computer vision-assisted systems is highly desirable to aid in the diagnostic process.

APPROACH

We present a deep learning approach that uses knowledge distillation (KD) from a convolutional neural network (CNN) teacher model to a vision transformer (ViT) student model for gastrointestinal abnormality recognition. The CNN teacher model uses attention mechanisms and depth-wise separable convolutions to extract features from WCE images and supervises the ViT in learning these representations.
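The abstract does not state the loss used for distillation. As a rough illustration only, the sketch below shows a generic soft-label distillation objective for a single sample: a weighted sum of the hard-label cross-entropy and the temperature-scaled KL divergence between teacher and student outputs. This is not the authors' implementation; the function names, `temperature`, `alpha`, and the example logits are all hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=3.0, alpha=0.5):
    """Generic soft-label KD loss for one sample (assumed form, not from the paper).

    alpha weights the hard-label cross-entropy of the student;
    (1 - alpha) weights KL(teacher || student) on softened outputs.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    ce = -log_softmax(student_logits)[label]

    # Soft-label term: KL divergence between softened distributions.
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * ce + (1.0 - alpha) * temperature ** 2 * kl


# Hypothetical two-class logits (e.g. normal vs. abnormal), true class 0.
teacher = [2.0, -1.0]
agreeing_student = [2.0, -1.0]
disagreeing_student = [-1.0, 2.0]

loss_agree = distillation_loss(agreeing_student, teacher, label=0)
loss_disagree = distillation_loss(disagreeing_student, teacher, label=0)
print(loss_agree < loss_disagree)
```

When the student reproduces the teacher's logits, the KL term vanishes and only the weighted cross-entropy remains, so the loss grows as the student drifts away from either the label or the teacher.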

RESULTS

The proposed method achieves accuracies of 97% and 96% on the Kvasir and KID datasets, respectively, demonstrating its effectiveness in distinguishing normal from abnormal regions and bleeding from non-bleeding cases. The proposed approach offers computational efficiency and generalization to unseen datasets, outperforming several state-of-the-art methods.

CONCLUSIONS

We proposed a deep learning approach utilizing CNNs and a ViT with KD to effectively classify gastrointestinal diseases in WCE images. It demonstrates promising performance on public datasets, distinguishing normal from abnormal regions and bleeding from non-bleeding cases while offering optimal computational efficiency compared with existing methods, making it suitable for GI disease applications.
