
Improving Laryngoscopy Image Analysis Through Integration of Global Information and Local Features in VoFoCD Dataset.

Authors

Dao Thao Thi Phuong, Huynh Tuan-Luc, Pham Minh-Khoi, Le Trung-Nghia, Nguyen Tan-Cong, Nguyen Quang-Thuc, Tran Bich Anh, Van Boi Ngoc, Ha Chanh Cong, Tran Minh-Triet

Affiliations

University of Science, Ho Chi Minh City, Vietnam.

John von Neumann Institute, Ho Chi Minh City, Vietnam.

Publication information

J Imaging Inform Med. 2024 Dec;37(6):2794-2809. doi: 10.1007/s10278-024-01068-z. Epub 2024 May 29.

DOI: 10.1007/s10278-024-01068-z
PMID: 38809338
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11612113/

Abstract

The diagnosis and treatment of vocal fold disorders heavily rely on the use of laryngoscopy. A comprehensive vocal fold diagnosis requires accurate identification of crucial anatomical structures and potential lesions during laryngoscopy observation. However, existing approaches have yet to explore the joint optimization of the decision-making process, including object detection and image classification tasks simultaneously. In this study, we provide a new dataset, VoFoCD, with 1724 laryngology images designed explicitly for object detection and image classification in laryngoscopy images. Images in the VoFoCD dataset are categorized into four classes and comprise six glottic object types. Moreover, we propose a novel Multitask Efficient trAnsformer network for Laryngoscopy (MEAL) to classify vocal fold images and detect glottic landmarks and lesions. To further facilitate interpretability for clinicians, MEAL provides attention maps to visualize important learned regions for explainable artificial intelligence results toward supporting clinical decision-making. We also analyze our model's effectiveness in simulated clinical scenarios where shaking of the laryngoscopy process occurs. The proposed model demonstrates outstanding performance on our VoFoCD dataset. The accuracy for image classification and mean average precision at an intersection over a union threshold of 0.5 (mAP50) for object detection are 0.951 and 0.874, respectively. Our MEAL method integrates global knowledge, encompassing general laryngoscopy image classification, into local features, which refer to distinct anatomical regions of the vocal fold, particularly abnormal regions, including benign and malignant lesions. Our contribution can effectively aid laryngologists in identifying benign or malignant lesions of vocal folds and classifying images in the laryngeal endoscopy process visually.
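
The mAP50 figure reported above depends on an overlap test: a predicted box counts as a correct detection when its intersection over union (IoU) with a ground-truth box reaches 0.5. The following is a minimal sketch of that test, not the authors' evaluation code. It assumes axis-aligned boxes given as (x1, y1, x2, y2) pixel coordinates; the names `Box`, `iou`, and `is_match` are hypothetical and do not come from the paper.

```python
from typing import Tuple

# Assumed box format for illustration: (x1, y1, x2, y2) with x2 > x1, y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    truth = (0.0, 0.0, 10.0, 10.0)
    print(is_match((2.0, 0.0, 12.0, 10.0), truth))  # overlap 80 / union 120
    print(is_match((5.0, 0.0, 15.0, 10.0), truth))  # overlap 50 / union 150
```

Broadly, mAP50 is then the mean over object classes of the average precision obtained with this matching rule, so a score of 0.874 summarizes both how many glottic objects are found and how well their boxes are placed.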

Similar articles

1. Improving Laryngoscopy Image Analysis Through Integration of Global Information and Local Features in VoFoCD Dataset.
J Imaging Inform Med. 2024 Dec;37(6):2794-2809. doi: 10.1007/s10278-024-01068-z. Epub 2024 May 29.
2. Support of deep learning to classify vocal fold images in flexible laryngoscopy.
Am J Otolaryngol. 2023 May-Jun;44(3):103800. doi: 10.1016/j.amjoto.2023.103800. Epub 2023 Feb 24.
3. Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16.
Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
4. A Convolutional Neural Network for Real Time Classification, Identification, and Labelling of Vocal Cord and Tracheal Using Laryngoscopy and Bronchoscopy Video.
J Med Syst. 2020 Jan 2;44(2):44. doi: 10.1007/s10916-019-1481-4.
5. Optical Biopsy: Automated Classification of Airway Endoscopic Findings Using a Convolutional Neural Network.
Laryngoscope. 2022 Feb;132 Suppl 4:S1-S8. doi: 10.1002/lary.28708. Epub 2020 Apr 28.
6. Automated detection of glottic laryngeal carcinoma in laryngoscopic images from a multicentre database using a convolutional neural network.
Clin Otolaryngol. 2023 May;48(3):436-441. doi: 10.1111/coa.14029. Epub 2023 Jan 20.
7. Vision-Based Assistance for Vocal Fold Identification in Laryngoscopy with Knowledge Distillation.
Stud Health Technol Inform. 2024 Jan 25;310:946-950. doi: 10.3233/SHTI231104.
8. Detection of Vocal Fold Image Obstructions in High-Speed Videoendoscopy During Connected Speech in Adductor Spasmodic Dysphonia: A Convolutional Neural Networks Approach.
J Voice. 2024 Jul;38(4):951-962. doi: 10.1016/j.jvoice.2022.01.028. Epub 2022 Mar 16.
9. Quantitative laryngoscopy with computer-aided diagnostic system for laryngeal lesions.
Sci Rep. 2021 May 12;11(1):10147. doi: 10.1038/s41598-021-89680-9.
10. Detection of laryngeal carcinoma during endoscopy using artificial intelligence.
Head Neck. 2023 Sep;45(9):2217-2226. doi: 10.1002/hed.27441. Epub 2023 Jun 28.
