Department of Computer Science and Information Engineering, National University of Kaohsiung, No. 700, Kaohsiung University Rd., Nan-Tzu Dist., 811, Kaohsiung, Taiwan.
J Digit Imaging. 2022 Oct;35(5):1101-1110. doi: 10.1007/s10278-022-00627-6. Epub 2022 Apr 27.
In endoscopy, a long, thin tube with a light source and a camera at its tip is inserted into the body to capture video frames inside the organs so that tumours can be visualised on a screen. However, these video frames contain multiple artefacts that hamper the diagnosis of cancers. In this research, deep learning was applied to detect eight kinds of artefact: specularity, bubbles, saturation, contrast, blood, instrument, blur, and imaging artefacts. Based on transfer learning with pre-trained parameters and fine-tuning, two state-of-the-art methods were applied for detection: faster region-based convolutional neural networks (Faster R-CNN) and EfficientDet. Experiments were implemented on the grand challenge dataset Endoscopy Artefact Detection and Segmentation (EAD2020). To validate our approach, we used the 2,200 phase I frames and the 331 phase II frames of the original training dataset, both with ground-truth annotations, as the training and testing sets, respectively. Among the tested methods, EfficientDet-D2 achieves a score of 0.2008 (mAP × 0.6 + mIoU × 0.4) on this dataset, better than three other baselines (Faster R-CNN, YOLOv3, and RetinaNet) and competitive with the best non-baseline result of 0.25123 on the leaderboard, although our testing was on the 331 phase II frames rather than the original 200 testing frames. Without improvement techniques beyond the basic neural networks, such as test-time augmentation, we showed that a simple baseline can achieve state-of-the-art performance in detecting artefacts in endoscopy. In conclusion, we propose the combination of EfficientDet-D2 with suitable data augmentation and pre-trained parameters during fine-tuning to detect artefacts in endoscopy.
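As an illustration of the transfer-learning setup described above, the following minimal Python sketch fine-tunes a detector from pre-trained parameters for the eight artefact classes. Torchvision's Faster R-CNN is used here only as an assumed example; the paper's actual training code, library choice, and hyper-parameters are not reproduced.

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 1 + 8  # background + the eight artefact classes in EAD2020

# Start from a detector pre-trained on COCO (transfer learning), then replace
# the classification head so it predicts the eight artefact classes.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

# Fine-tune all parameters (pre-trained backbone included) on the EAD2020 frames;
# the optimiser settings here are illustrative, not values from the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9, weight_decay=5e-4)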
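The leaderboard score cited above is a weighted combination of mAP and mIoU; a minimal sketch of that computation follows, with made-up input values used purely for illustration.

def ead_score(mAP: float, mIoU: float) -> float:
    # Weighted combination of detection mAP and mIoU, as stated in the abstract.
    return 0.6 * mAP + 0.4 * mIoU

# Hypothetical numbers for illustration only (not results reported in the paper).
print(ead_score(mAP=0.25, mIoU=0.13))  # 0.202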