Detection of Vocal Fold Image Obstructions in High-Speed Videoendoscopy During Connected Speech in Adductor Spasmodic Dysphonia: A Convolutional Neural Networks Approach.

Affiliations

Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan.

Head and Neck Regenerative Medicine Program, Mayo Clinic, Scottsdale, Arizona; Department of Otolaryngology-Head and Neck Surgery, Mayo Clinic, Phoenix, Arizona.

Publication information

J Voice. 2024 Jul;38(4):951-962. doi: 10.1016/j.jvoice.2022.01.028. Epub 2022 Mar 16.

DOI:10.1016/j.jvoice.2022.01.028

PMID:35304042

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9474736/

Abstract

OBJECTIVE

Adductor spasmodic dysphonia (AdSD) is a neurogenic voice disorder that affects intrinsic laryngeal muscle control. AdSD causes involuntary laryngeal spasms and manifests only during connected speech. Laryngeal high-speed videoendoscopy (HSV) coupled with a flexible fiberoptic endoscope provides a unique opportunity to study voice production and visualize vocal fold vibrations in AdSD during speech. The goal of this study is to automatically detect instances in which the image of the vocal folds is optically obstructed in HSV recordings obtained during connected speech.

METHODS

HSV data were recorded from vocally normal adults and patients with AdSD during reading of the "Rainbow Passage", six CAPE-V sentences, and production of the vowel /i/. A convolutional neural network was developed and trained as a classifier to detect obstructed/unobstructed vocal folds in HSV frames. Manually labelled data were used for training, validation, and testing of the network. Moreover, a comprehensive robustness evaluation was conducted to compare the performance of the developed classifier against visual analysis of HSV data.
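To make the idea of a frame-level convolutional classifier concrete, the sketch below runs the forward pass of a toy network: one convolution layer, a ReLU, global average pooling, and a sigmoid output read as the probability that a frame is obstructed. The abstract does not describe the authors' architecture, so everything here is an assumption for illustration: the function names, the 4x4 toy frame, the two hand-picked kernels, and the untrained weights are all hypothetical.

```python
import math


def conv2d(image, kernel):
    """Valid 2D cross-correlation of a grayscale image with a single kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average(feature_map):
    """Collapse a feature map to one number by averaging all of its values."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def predict_obstructed(frame, kernels, weights, bias):
    """Return a probability in (0, 1) that the frame is obstructed.

    kernels, weights, and bias are hypothetical, untrained parameters;
    a real classifier would learn them from manually labelled frames.
    """
    features = [global_average(relu(conv2d(frame, k))) for k in kernels]
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 4x4 "frame" with a vertical intensity edge (hypothetical data).
frame = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]
horizontal_edge = [[-1, -1], [1, 1]]

probability = predict_obstructed(
    frame,
    kernels=[vertical_edge, horizontal_edge],
    weights=[1.5, -0.5],
    bias=-1.0,
)
label = "obstructed" if probability >= 0.5 else "unobstructed"
print(f"P(obstructed) = {probability:.3f} -> {label}")
```

In practice such a network would have many layers and learned filters, and would be built with a deep learning framework rather than nested lists; the sketch only shows how convolutional features can feed a binary obstructed/unobstructed decision.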

RESULTS

The developed convolutional neural network automatically detected vocal fold obstructions in HSV data from vocally normal participants and patients with AdSD. The trained network achieved an overall classification accuracy of 94.18% on the testing dataset. The robustness evaluation showed an average overall accuracy of 94.81% on a large number of HSV frames, indicating that the technique is robust while maintaining high accuracy.
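Overall classification accuracy is the fraction of frames whose predicted label matches the manual label. The sketch below computes it, along with the four confusion-matrix counts, for a short list of frame labels. The label encoding (1 = obstructed, 0 = unobstructed), the function names, and the ten example labels are hypothetical and are not taken from the study's data.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count true/false positives and negatives, treating 1 (obstructed) as positive."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == 1 and guess == 1:
            counts["tp"] += 1
        elif truth == 0 and guess == 0:
            counts["tn"] += 1
        elif truth == 0 and guess == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of frames where the prediction agrees with the manual label."""
    counts = confusion_counts(true_labels, predicted_labels)
    return (counts["tp"] + counts["tn"]) / len(true_labels)


# Hypothetical manual labels and classifier outputs for ten frames.
manual = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]

print(confusion_counts(manual, predicted))
print(f"overall accuracy = {overall_accuracy(manual, predicted):.2%}")
```

Reporting the confusion counts next to the accuracy helps show whether errors are mostly missed obstructions or false alarms, which a single accuracy figure cannot reveal.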

CONCLUSIONS

The proposed approach can be used for efficient analysis of HSV data to study laryngeal maneuvers in patients with AdSD during connected speech. Additionally, this method will facilitate the development of vocal fold vibratory measures for HSV frames with an unobstructed view of the vocal folds. Identifying the parts of connected speech that provide an unobstructed view of the vocal folds can inform the development of optimal passages for precise HSV examination during connected speech and of subject-specific clinical voice assessment protocols.
