
A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech.

Authors

Yousef Ahmed M, Deliyski Dimitar D, Zacharias Stephanie R C, de Alarcon Alessandro, Orlikoff Robert F, Naghibolhosseini Maryam

Affiliations

Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA.

Head and Neck Regenerative Medicine Program, Mayo Clinic, Scottsdale, AZ 85259, and Department of Otolaryngology-Head and Neck Surgery, Mayo Clinic, Phoenix, AZ 85054, USA.

Publication information

Appl Sci (Basel). 2021 Feb;11(3). doi: 10.3390/app11031179. Epub 2021 Jan 27.

DOI:10.3390/app11031179
PMID:33717604
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7954580/
Abstract

Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands the accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the "Rainbow Passage." The introduced technique was based on an unsupervised machine-learning (ML) approach combined with an active contour modeling (ACM) technique (also known as a hybrid approach). The hybrid method was implemented to capture the edges of vocal folds on different HSV kymograms, extracted at various cross-sections of vocal folds during vibration. The k-means clustering method, an ML approach, was first applied to cluster the kymograms to identify the clustered glottal area and consequently provided an initialized contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
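The two-stage idea described in the abstract (cluster a kymogram to get a rough glottal region, then refine the edges) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the kymogram is synthetic, the function names (`make_kymogram`, `kmeans_two_clusters`, `detect_edges`, and so on) are hypothetical, and the refinement step is a simple gradient snap that stands in for the active contour model. It is not the authors' implementation.

```python
import math

def half_width(t, period=20, max_half_width=8):
    """Glottal half-width in pixels at frame t for the synthetic signal."""
    return int(round(max_half_width * max(0.0, math.sin(2 * math.pi * t / period))))

def make_kymogram(n_frames=60, width=41):
    """Synthetic kymogram: one row per frame, one column per pixel along a
    single cross-section. Dark pixels mimic the open glottis."""
    center = width // 2
    kymo = []
    for t in range(n_frames):
        half = half_width(t)
        row = []
        for x in range(width):
            inside = half > 0 and abs(x - center) <= half
            texture = 5 * ((7 * x + 3 * t) % 5)  # deterministic stand-in for noise
            row.append((20 if inside else 180) + texture)
        kymo.append(row)
    return kymo, center

def kmeans_two_clusters(values, n_iter=50):
    """1-D k-means with k=2; returns (dark_centroid, bright_centroid)."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(n_iter):
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(dark) / len(dark)
        new_hi = sum(bright) / len(bright) if bright else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return lo, hi

def initial_edges(row, lo, hi):
    """Leftmost and rightmost pixel in the dark cluster, or None if closed."""
    dark_idx = [x for x, v in enumerate(row) if abs(v - lo) <= abs(v - hi)]
    return (dark_idx[0], dark_idx[-1]) if dark_idx else None

def refine_edges(row, edges, window=2):
    """Snap each initial edge to the strongest nearby intensity step.
    Simplified stand-in for the active contour refinement."""
    left0, right0 = edges
    n = len(row)
    # Left edge: first dark pixel, i.e. the largest drop from row[x-1] to row[x].
    left_cands = range(max(1, left0 - window), min(n - 1, left0 + window) + 1)
    left = max(left_cands, key=lambda x: row[x - 1] - row[x])
    # Right edge: last dark pixel, i.e. the largest rise from row[x] to row[x+1].
    right_cands = range(max(0, right0 - window), min(n - 2, right0 + window) + 1)
    right = max(right_cands, key=lambda x: row[x + 1] - row[x])
    return left, right

def detect_edges(kymo):
    """Cluster all pixels once, then find and refine the edges frame by frame."""
    pixels = [v for row in kymo for v in row]
    lo, hi = kmeans_two_clusters(pixels)
    result = []
    for row in kymo:
        init = initial_edges(row, lo, hi)
        result.append(refine_edges(row, init) if init else None)
    return result
```

Calling `detect_edges(make_kymogram()[0])` returns one `(left, right)` pixel pair per frame, or `None` for frames where the synthetic glottis is closed. On real HSV data, the clustering would run on measured intensities and the refinement would be replaced by a proper active contour.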


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8dd/7954580/c57b59e3226d/nihms-1674135-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8dd/7954580/c6bb23b23bc4/nihms-1674135-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8dd/7954580/36c4d1682b8e/nihms-1674135-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8dd/7954580/d180c495bce4/nihms-1674135-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8dd/7954580/09a744938c0d/nihms-1674135-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8dd/7954580/e0d120f9c295/nihms-1674135-f0006.jpg

Similar articles

1
A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech.
Appl Sci (Basel). 2021 Feb;11(3). doi: 10.3390/app11031179. Epub 2021 Jan 27.
2
Spatial Segmentation for Laryngeal High-Speed Videoendoscopy in Connected Speech.
J Voice. 2023 Jan;37(1):26-36. doi: 10.1016/j.jvoice.2020.10.017. Epub 2020 Nov 27.
3
A Deep Learning Approach for Quantifying Vocal Fold Dynamics During Connected Speech Using Laryngeal High-Speed Videoendoscopy.
J Speech Lang Hear Res. 2022 Jun 8;65(6):2098-2113. doi: 10.1044/2022_JSLHR-21-00540. Epub 2022 May 23.
4
Detection of Vocal Fold Image Obstructions in High-Speed Videoendoscopy During Connected Speech in Adductor Spasmodic Dysphonia: A Convolutional Neural Networks Approach.
J Voice. 2024 Jul;38(4):951-962. doi: 10.1016/j.jvoice.2022.01.028. Epub 2022 Mar 16.
5
Deep-Learning-Based Representation of Vocal Fold Dynamics in Adductor Spasmodic Dysphonia during Connected Speech in High-Speed Videoendoscopy.
J Voice. 2025 Mar;39(2):570.e1-570.e15. doi: 10.1016/j.jvoice.2022.08.022. Epub 2022 Sep 23.
6
Temporal Segmentation for Laryngeal High-Speed Videoendoscopy in Connected Speech.
J Voice. 2018 Mar;32(2):256.e1-256.e12. doi: 10.1016/j.jvoice.2017.05.014. Epub 2017 Jun 21.
7
Optimal Deep Learning-Based Vocal Fold Disorder Detection and Classification Model on High-Speed Video Endoscopy.
J Healthc Eng. 2022 Oct 17;2022:4248938. doi: 10.1155/2022/4248938. eCollection 2022.
8
Deep Learning-Based Analysis of Glottal Attack and Offset Times in Adductor Laryngeal Dystonia.
J Voice. 2023 Nov 15. doi: 10.1016/j.jvoice.2023.10.011.
9
Empirical Distribution of Glottal Edges (EDGE): A Statistical Assessment of Vocal Fold Kinematics Using High-Speed Videoendoscopy.
IEEE J Biomed Health Inform. 2025 Feb;29(2):1087-1100. doi: 10.1109/JBHI.2024.3462632. Epub 2025 Feb 10.
10
Synthetic multi-line kymographic analysis: A spatiotemporal data reduction technique for high-speed videoendoscopy.
J Acoust Soc Am. 2016 Oct;140(4):2703. doi: 10.1121/1.4964400.

Cited by

1
Male-female specific changes in voice parameters under varying room acoustics.
Proc Meet Acoust. 2024 Nov 18;55(1). doi: 10.1121/2.0001979. Epub 2024 Dec 11.
2
Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning.
Folia Phoniatr Logop. 2025 Feb 21:1-15. doi: 10.1159/000544852.
3
Supraglottic Laryngeal Maneuvers in Adductor Laryngeal Dystonia During Connected Speech.
