Hussain Muhammad Sajjad, Asgher Umer, Nisar Sajid, Socha Vladimir, Shaukat Arslan, Wang Jinhui, Feng Tian, Paracha Rehan Zafar, Khan Muhammad Ali
Department of Computer Science, Sir Syed (CASE) Institute of Technology, Islamabad, Pakistan.
Laboratory of Human Factors and Automation in Aviation, Department of Air Transport, Faculty of Transportation Sciences, Czech Technical University in Prague (CTU), Prague, Czechia.
Front Robot AI. 2024 Aug 9;11:1387491. doi: 10.3389/frobt.2024.1387491. eCollection 2024.
Colonoscopy is a reliable diagnostic method for the early detection of colorectal polyps and the prevention of colorectal cancer. Current examination techniques face the significant challenge of high miss rates, leaving numerous polyps and irregularities undetected. Automated, real-time segmentation methods can help endoscopists delineate the shape and location of polyps in colonoscopy images, facilitating clinicians' timely diagnosis and intervention. Several factors make this task challenging: the varied shapes of polyps, their small sizes, and their close resemblance to surrounding tissue. Furthermore, the demand for high-definition image quality and the reliance on the operator make real-time, accurate endoscopic image segmentation even more difficult. Deep learning models for polyp segmentation, designed to capture diverse patterns, are becoming progressively complex, which poses challenges for real-time medical operations. Deploying automated methods in clinical settings requires accurate, lightweight models with minimal latency that integrate seamlessly with endoscopic hardware. To address these challenges, this study proposes Enhanced Nanonet, a novel lightweight and more generalized model that improves on Nanonet (using the NanonetB variant) for real-time and precise colonoscopy image segmentation. The proposed model enhances NanonetB's overall prediction scheme by applying data augmentation, Conditional Random Field (CRF) post-processing, and Test-Time Augmentation (TTA). Six publicly available datasets are used for thorough evaluation, generalizability assessment, and validation of the improvements: Kvasir-SEG, Endotect Challenge 2020, Kvasir-instrument, CVC-ClinicDB, CVC-ColonDB, and CVC-300.
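The TTA step of the prediction scheme described above can be sketched as follows. This is a minimal illustration, assuming flip-based augmentations and a generic `model` callable that maps an HxWxC image to an HxW probability mask; the paper's exact transforms and aggregation may differ.

```python
import numpy as np

def predict_with_tta(model, image):
    """Average model predictions over flip-based test-time augmentations.

    `model` is assumed to map an HxWxC image to an HxW probability mask.
    Each augmented prediction is mapped back (un-flipped) to the original
    frame before averaging.
    """
    preds = []
    # identity (no augmentation)
    preds.append(model(image))
    # horizontal flip: predict on the flipped image, then un-flip the mask
    preds.append(np.fliplr(model(np.fliplr(image))))
    # vertical flip: same idea along the other axis
    preds.append(np.flipud(model(np.flipud(image))))
    return np.mean(preds, axis=0)

# demo with a dummy "model" that thresholds the mean channel intensity
dummy_model = lambda img: (img.mean(axis=-1) > 0.5).astype(float)
image = np.random.rand(4, 4, 3)
mask = predict_with_tta(dummy_model, image)
```

Averaging the un-flipped predictions smooths out orientation-dependent errors, which is one reason TTA tends to improve segmentation robustness at inference time without changing the trained model.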
Through extensive experimentation on the Kvasir-SEG dataset, our model achieves a mIoU of 0.8188 and a Dice coefficient of 0.8060 with only 132,049 parameters and minimal computational resources. A thorough cross-dataset evaluation was performed to assess the generalization capability of the proposed Enhanced Nanonet model across the publicly available polyp datasets for potential real-world applications. The results of this study show that applying CRF and TTA improves performance both within the same dataset and across diverse datasets, at a model size of just 132,049 parameters. The proposed method also yields improved results in detecting smaller and sessile (flat) polyps, which are significant contributors to the high miss rates.
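The mIoU and Dice scores reported above are standard overlap metrics for binary segmentation masks. A minimal sketch of how they are computed (per-image, for binary masks; the paper's exact averaging over the test set may differ):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks; eps avoids 0/0."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """IoU (Jaccard) = |A∩B| / |A∪B| for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# toy example: 2 predicted-positive pixels, 1 of which matches ground truth
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
d = dice_coefficient(pred, gt)  # 2*1 / (2+1) ≈ 0.667
j = iou_score(pred, gt)         # 1 / 2 = 0.5
```

Dice is always at least as large as IoU for the same pair of masks, which is why the two scores in the abstract are close but not identical in magnitude across datasets.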