Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Zacatecas, Zacatecas, México.
PeerJ. 2023 Mar 16;11:e14806. doi: 10.7717/peerj.14806. eCollection 2023.
The gastrointestinal (GI) tract can be affected by different diseases or lesions such as esophagitis, ulcers, hemorrhoids, and polyps, among others. Some of them, such as polyps, can be precursors of cancer. Endoscopy is the standard procedure for detecting these lesions. Its main drawback is that the diagnosis depends on the expertise of the doctor, which means that some important findings may be missed. In recent years, this problem has been addressed with deep learning (DL) techniques. Endoscopic studies produce digital images, and the most widely used DL technique for image processing is the convolutional neural network (CNN), owing to its high accuracy in modeling complex phenomena. CNNs differ in their architecture. In this article, four architectures are compared: AlexNet, DenseNet-201, Inception-v3, and ResNet-101. To determine which architecture best classifies GI tract lesions, a set of metrics was used: accuracy, precision, sensitivity, specificity, F1-score, and area under the curve (AUC). The architectures were trained and tested on the HyperKvasir dataset, from which a total of 6,792 images corresponding to 10 findings were used. A transfer learning approach and a data augmentation technique were applied. The best-performing architecture was DenseNet-201, with 97.11% accuracy, 96.3% sensitivity, 99.67% specificity, and 95% AUC.
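As an illustration of how the threshold-based metrics named above relate to one another, the following minimal sketch derives accuracy, precision, sensitivity, specificity, and F1-score from a multiclass confusion matrix using a one-vs-rest view of each class. The function names, the macro averaging, and the 3-class toy matrix are assumptions made for this example; the abstract does not state how the per-class values were aggregated, and the numbers below are not taken from the study.

```python
from typing import Dict, List


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy: correctly classified samples over all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest metrics for each class.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted other
        fp = sum(cm[i][k] for i in range(n)) - tp   # true other, predicted k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
        })
    return results


def macro_average(metrics: List[Dict[str, float]], name: str) -> float:
    """Unweighted mean of one metric across classes (an assumed choice)."""
    return sum(m[name] for m in metrics) / len(metrics)


# Toy 3-class confusion matrix, invented purely for illustration.
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

if __name__ == "__main__":
    scores = per_class_metrics(toy_cm)
    print("accuracy:", accuracy(toy_cm))
    for name in ("precision", "sensitivity", "specificity", "f1"):
        print(name, "(macro):", macro_average(scores, name))
```

With many classes, specificity tends to be high because the true negatives for each class include every correctly handled sample from all other classes, which is consistent with specificity being the largest of the reported figures. AUC is not covered here, since it requires per-class prediction scores rather than a confusion matrix.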