Detection of Vocal Fold Image Obstructions in High-Speed Videoendoscopy During Connected Speech in Adductor Spasmodic Dysphonia: A Convolutional Neural Networks Approach.

Affiliations

Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan.

Head and Neck Regenerative Medicine Program, Mayo Clinic, Scottsdale, Arizona; Department of Otolaryngology-Head and Neck Surgery, Mayo Clinic, Phoenix, Arizona.

Publication information

J Voice. 2024 Jul;38(4):951-962. doi: 10.1016/j.jvoice.2022.01.028. Epub 2022 Mar 16.

DOI:10.1016/j.jvoice.2022.01.028
PMID:35304042
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9474736/
Abstract

OBJECTIVE

Adductor spasmodic dysphonia (AdSD) is a neurogenic voice disorder that affects intrinsic laryngeal muscle control. AdSD causes involuntary laryngeal spasms and manifests only during connected speech. Laryngeal high-speed videoendoscopy (HSV) coupled with a flexible fiberoptic endoscope provides a unique opportunity to study voice production and visualize vocal fold vibrations in AdSD during speech. The goal of this study is to automatically detect instances in which the image of the vocal folds is optically obstructed in HSV recordings obtained during connected speech.

METHODS

HSV data were recorded from vocally normal adults and patients with AdSD during reading of the "Rainbow Passage" and six CAPE-V sentences, and during production of the vowel /i/. A convolutional neural network was developed and trained as a classifier to detect obstructed/unobstructed vocal folds in HSV frames. Manually labelled data were used to train, validate, and test the network. Moreover, a comprehensive robustness evaluation was conducted to compare the performance of the developed classifier against visual analysis of HSV data.

RESULTS

The developed convolutional neural network was able to automatically detect vocal fold obstructions in HSV data from vocally normal participants and AdSD patients. The trained network showed an overall classification accuracy of 94.18% on the testing dataset. The robustness evaluation showed an average overall accuracy of 94.81% across a large number of HSV frames, indicating that the introduced technique is robust while maintaining a high level of accuracy.
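
An overall classification accuracy of this kind is the fraction of frames whose predicted label matches the manual label. A minimal sketch in Python, assuming labels are encoded as 1 (obstructed) and 0 (unobstructed); the function name, the encoding, and the toy labels are illustrative assumptions and are not taken from the study:

```python
def overall_accuracy(manual_labels, predicted_labels):
    """Return the fraction of frames where the prediction equals the manual label."""
    if len(manual_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not manual_labels:
        raise ValueError("at least one frame is required")
    # Count frames on which the classifier agrees with the human rater.
    correct = sum(1 for m, p in zip(manual_labels, predicted_labels) if m == p)
    return correct / len(manual_labels)


# Toy data (hypothetical): 1 = obstructed, 0 = unobstructed.
manual = [0, 0, 1, 1, 0, 1, 0, 0]
predicted = [0, 0, 1, 0, 0, 1, 0, 1]
accuracy = overall_accuracy(manual, predicted)  # 6 of 8 frames agree
```

Overall accuracy alone can hide class imbalance, so a per-class breakdown (for example, a confusion matrix) is a common companion to this figure.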

CONCLUSIONS

The proposed approach can be used for efficient analysis of HSV data to study laryngeal maneuvers in patients with AdSD during connected speech. Additionally, this method will facilitate the development of vocal fold vibratory measures for HSV frames with an unobstructed view of the vocal folds. Identifying the parts of connected speech that provide an unobstructed view of the vocal folds can inform the development of optimal passages for precise HSV examination during connected speech and of subject-specific clinical voice assessment protocols.
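
One way to turn per-frame predictions into "parts of connected speech with an unobstructed view" is to group consecutive unobstructed frames into segments. The sketch below is an assumption about how that post-processing could look, not the authors' procedure; the label encoding (0 = unobstructed, 1 = obstructed), the function name, and the `min_length` parameter are hypothetical:

```python
def unobstructed_segments(frame_labels, min_length=1):
    """Return (start, end) frame-index pairs, end exclusive, for runs of label 0."""
    segments = []
    start = None  # index where the current unobstructed run began, if any
    for i, label in enumerate(frame_labels):
        if label == 0 and start is None:
            start = i
        elif label != 0 and start is not None:
            # The run ended at frame i; keep it only if it is long enough.
            if i - start >= min_length:
                segments.append((start, i))
            start = None
    # Close a run that extends to the final frame.
    if start is not None and len(frame_labels) - start >= min_length:
        segments.append((start, len(frame_labels)))
    return segments


# Toy predictions (hypothetical): keep runs of at least 2 unobstructed frames.
labels = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0]
segments = unobstructed_segments(labels, min_length=2)
```

With a known frame rate, the index pairs can be converted to time stamps and aligned with the spoken passage.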
