Department of Otorhinolaryngology, Marmara University Faculty of Medicine, Pendik Training and Research Hospital, Fevzi Çakmak Muhsin Yazıcıoğlu Street, İstanbul, 34899, Turkey.
VRLab Academy, 32 Willoughby Rd, Harringay Ladder, London, N8 0JG, UK.
Eur Arch Otorhinolaryngol. 2024 Nov;281(11):6083-6091. doi: 10.1007/s00405-024-08801-y. Epub 2024 Jul 13.
To develop a convolutional neural network (CNN)-based model that classifies videostroboscopic images of patients with sulcus, patients with benign vocal fold (VF) lesions, and individuals with healthy VFs, with the aim of improving clinicians' diagnostic accuracy when evaluating sulcus during videostroboscopy.
Videostroboscopies of 433 individuals who were diagnosed with sulcus (91), who were diagnosed with benign VF diseases (i.e., polyp, nodule, papilloma, cyst, or pseudocyst [311]), or who were healthy (33) were analyzed. After extracting 91,159 frames from videostroboscopies, a CNN-based model was created and tested. The healthy and sulcus groups underwent binary classification. In the second phase of the study, benign VF lesions were added to the training set, and multiclassification was executed across all groups. The proposed CNN-based model results were compared with five laryngology experts' assessments.
In the binary classification phase, the CNN-based model achieved 98% accuracy, 98% recall, 97% precision, and a 97% F1 score for classifying sulcus and healthy VFs. In the multiclassification phase, when evaluated on a subset of frames encompassing all included groups, the CNN-based model was more accurate than each of the five laryngologists (76% versus 72%, 68%, 72%, 63%, and 72%).
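For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, recall, precision, and F1 score are conventionally derived from a binary confusion matrix, with sulcus treated as the positive class. The function name `binary_metrics` and the counts in the example are hypothetical and purely illustrative; they are not the study's data, and the abstract does not state how the authors computed or averaged their figures.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: positive frames (e.g. sulcus) correctly classified as positive
    fp: negative frames (e.g. healthy) wrongly classified as positive
    fn: positive frames wrongly classified as negative
    tn: negative frames correctly classified as negative
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Recall (sensitivity): share of true positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of positive predictions that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because the healthy and sulcus groups differ in size (33 versus 91 individuals), accuracy alone can be misleading, which is presumably why recall, precision, and F1 score are reported alongside it.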
A CNN-based model can serve as a significant aid in diagnosing sulcus, a VF disease that is notably difficult to diagnose. Further research could assess the feasibility of applying this approach in real time in clinical practice.