Baldini Chiara, Migliorelli Lucia, Berardini Daniele, Azam Muhammad Adeel, Sampieri Claudio, Ioppi Alessandro, Srivastava Rakesh, Peretti Giorgio, Mattos Leonardo S
Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genova, Italy; Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, Genova, Italy.
Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genova, Italy; Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy.
Comput Methods Programs Biomed. 2025 Mar;260:108539. doi: 10.1016/j.cmpb.2024.108539. Epub 2024 Dec 13.
Laryngeal Cancer (LC) constitutes approximately one third of head and neck cancers. Detecting early-stage lesions in this anatomical region is crucial for achieving a high survival rate. However, early detection poses significant diagnostic challenges owing to the varied appearance of lesions and the need for precise characterization for appropriate clinical management. Conventional diagnostic approaches rely heavily on endoscopic examination, which often requires expert interpretation and may be limited by subjective assessment. Deep learning (DL) approaches offer promising opportunities for automating lesion detection, but their efficacy in handling multi-modal imaging data and accurately localizing small lesions remains under investigation. Furthermore, the clinical domain could benefit substantially from efficient DL methods that ensure equitable access to advanced technologies even where resources are limited. In this study, a DL-based approach named SRE-YOLO was introduced to provide real-time assistance to less-experienced personnel during laryngeal assessment by automatically detecting lesions at different scales in endoscopic White Light (WL) and Narrow-Band Imaging (NBI) images.
During training, SRE-YOLO integrates a YOLOv8 nano (YOLOv8n) baseline with a Super-Resolution (SR) branch to enhance lesion detection. The SR branch is decoupled during inference to preserve the low computational demand of the YOLOv8n baseline. The evaluation was conducted on a multi-center dataset encompassing diverse laryngeal pathologies and acquisition modalities.
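The training-only auxiliary branch described above can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the class `AuxBranchDetector`, the function `joint_loss`, and the weight `sr_weight` are hypothetical names, and the abstract does not state how the detection and SR losses are actually combined. The sketch only shows the general pattern in which an auxiliary head shares the backbone features during training and is removed before deployment, so that inference runs the same modules as the baseline.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Features = List[float]


@dataclass
class AuxBranchDetector:
    """Toy detector with a shared backbone and a training-only auxiliary head.

    All components are plain callables standing in for network modules
    (hypothetical; not the SRE-YOLO architecture itself).
    """
    backbone: Callable[[Features], Features]
    detect_head: Callable[[Features], Features]
    sr_head: Optional[Callable[[Features], Features]] = None
    calls: List[str] = field(default_factory=list)  # records which modules ran

    def forward(self, x: Features, training: bool) -> Tuple[Features, Optional[Features]]:
        feats = self.backbone(x)
        self.calls.append("backbone")
        det = self.detect_head(feats)
        self.calls.append("detect_head")
        sr = None
        # The auxiliary head runs only while training and only if still attached.
        if training and self.sr_head is not None:
            sr = self.sr_head(feats)
            self.calls.append("sr_head")
        return det, sr

    def decouple(self) -> None:
        """Drop the auxiliary head so deployment matches the baseline cost."""
        self.sr_head = None


def joint_loss(det_loss: float, sr_loss: float, sr_weight: float = 0.1) -> float:
    """Weighted sum of the two objectives; sr_weight is an assumed value."""
    return det_loss + sr_weight * sr_loss
```

In this pattern the detection output does not depend on the auxiliary head, so removing it changes the compute cost but not the predictions of the trained detector.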
The SRE-YOLO method improved the Average Precision (AP) in lesion detection by 5% with respect to the YOLOv8n baseline, while maintaining the inference speed of 58.8 Frames Per Second (FPS). Comparative analyses against state-of-the-art DL methods highlighted the efficacy of the SRE-YOLO approach in balancing detection accuracy, computational efficiency, and real-time applicability.
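To put the reported throughput in terms of per-frame time, the short sketch below converts frames per second into a latency budget. The function names and the 30 FPS video rate used in the usage note are illustrative assumptions; the abstract does not state the frame rate of the endoscopic video source.

```python
def frame_latency_ms(fps: float) -> float:
    """Average processing time per frame, in milliseconds, for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def keeps_up_with_video(model_fps: float, video_fps: float) -> bool:
    """True if the model processes frames at least as fast as the video delivers them."""
    return frame_latency_ms(model_fps) <= frame_latency_ms(video_fps)
```

For the reported 58.8 FPS, `frame_latency_ms(58.8)` is about 17 ms per frame, which would fit within the roughly 33 ms frame interval of a hypothetical 30 FPS video stream.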
This research underscores the potential of SRE-YOLO in developing efficient DL-driven decision support systems for real-time detection of laryngeal lesions at different scales from both WL and NBI endoscopic data.