Enhanced gastrointestinal disease classification using a ConvViT hybrid model on endoscopic images.

Author information

Utku Anıl

Affiliation

Computer Engineering, Faculty of Engineering, Munzur University, Tunceli, Turkey.

Publication information

Phys Eng Sci Med. 2025 Jul 21. doi: 10.1007/s13246-025-01600-7.


PMID:40691412

Abstract

Endoscopy is a procedure that uses an endoscope to examine the gastrointestinal system, including the stomach, esophagus, large intestine, and duodenum. Processing endoscopic images is important for the early detection and treatment of gastrointestinal diseases. In this study, a hybrid model, ConvViT, was developed by combining a convolutional neural network (CNN) with a vision transformer (ViT) to increase the classification accuracy of pathologies in gastrointestinal endoscopic images. CNNs are well suited to capturing local spatial features through hierarchical convolutions, which makes them highly effective at detecting fine-grained textures and edge patterns. These capabilities complement the ViT's global attention mechanism, which excels at modeling long-range dependencies in images. The motivation of this study is to increase classification accuracy and reliability with ConvViT, which combines the strengths of CNN and ViT models, each of which succeeds in different aspects of image processing. ConvViT was compared with VGG-16, ResNet-50, Inception-V3, and ViT. All models were evaluated on a gastrointestinal endoscopic image dataset containing ulcers, polyps, inflammation, bleeding, and normal anatomical features. In the experiments, ConvViT outperformed the compared models, reaching a classification accuracy of 95.87%.
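The complementary roles of convolution (local) and self-attention (global) described in the abstract can be illustrated with a small pure-Python sketch. This is not the ConvViT architecture from the paper, whose layers and hyperparameters are not given in the abstract. The function names (`conv2d_valid`, `self_attention`, `hybrid_features`), the toy 4x4 image, the edge kernel, and the use of identity query/key/value projections are all assumptions made for illustration only.

```python
import math


def conv2d_valid(image, kernel):
    """Slide a small kernel over the image (stride 1, no padding).

    Each output value depends only on a local window of the input,
    which is what lets convolutions pick up edges and fine textures.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), so there are no learned weights here.
    Every output token is a weighted mix of ALL tokens, which is the
    global, long-range behaviour attributed to ViT-style models.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in tokens]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, tokens))
            for d in range(dim)
        ])
    return outputs


def hybrid_features(image, kernel):
    """Hypothetical hybrid: convolve first, then attend over all positions."""
    feature_map = conv2d_valid(image, kernel)
    # Flatten the feature map into one single-channel token per position.
    tokens = [[value] for row in feature_map for value in row]
    return self_attention(tokens)


# Toy 4x4 "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # horizontal difference: responds to vertical edges

edge_map = conv2d_valid(image, kernel)
features = hybrid_features(image, kernel)
```

In this toy setup the convolution responds only where the edge lies inside its window, while the attention step lets each position draw on every other position in the feature map. A trained hybrid model would add learned projections, multiple heads and channels, and a classification head on top.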

