Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning.

Authors

Yousef Ahmed M, Castillo-Allendes Adrián, Berardi Mark L, Codino Juliana, Rubin Adam D, Hunter Eric J

Affiliations

Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA.

Department of Surgery, Harvard Medical School, Boston, Massachusetts, USA.

Publication information

Folia Phoniatr Logop. 2025 Feb 21:1-15. doi: 10.1159/000544852.

Abstract

INTRODUCTION

The Acoustic Voice Quality Index (AVQI) and Smoothed Cepstral Peak Prominence (CPPs) have been reported to effectively support the assessment of voice quality in persons seeking voice care across many languages. This study aimed to evaluate the diagnostic accuracy of these two measures in detecting voice disorders in American English speakers, comparing their performance to machine learning (ML) models.

METHODS

This retrospective study included a cohort of 187 participants: 138 patients with clinically diagnosed voice disorders and 49 vocally healthy individuals. Each participant completed two voicing tasks, sustaining the vowel [a:] and producing a running speech sample, and the two recordings were then concatenated. These samples were analyzed using VOXplot software for AVQI-3 (version 03.01) and CPPs. Additionally, four ML models (random forest, k-nearest neighbors, support vector machine, and decision tree) were trained for comparison. The diagnostic accuracy of the two measures and the models was assessed using several evaluation metrics, including the receiver operating characteristic curve and the Youden Index.
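As a rough illustration of how a cutoff can be derived with the Youden Index, the sketch below scans candidate thresholds and keeps the one that maximizes J = sensitivity + specificity - 1. It is not the authors' pipeline: the function name `youden_cutoff`, the `higher_is_positive` flag, and the toy scores are all hypothetical. The flag reflects the assumption that higher values of an index such as AVQI point toward a disorder, whereas for CPPs lower values do.

```python
def youden_cutoff(scores, labels, higher_is_positive=True):
    """Return (cutoff, sensitivity, specificity, J) for the threshold
    that maximizes Youden's J = sensitivity + specificity - 1.

    labels: 1 = voice disorder, 0 = vocally healthy.
    higher_is_positive: True if larger scores indicate a disorder.
    """
    best = None
    for cutoff in sorted(set(scores)):
        tp = fn = tn = fp = 0
        for score, label in zip(scores, labels):
            if higher_is_positive:
                predicted_positive = score >= cutoff
            else:
                predicted_positive = score <= cutoff
            if label == 1:
                tp += int(predicted_positive)
                fn += int(not predicted_positive)
            else:
                fp += int(predicted_positive)
                tn += int(not predicted_positive)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        # Keep the first threshold that reaches the highest J.
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best


# Hypothetical toy data, not taken from the study.
toy_scores = [0.5, 1.0, 1.2, 1.4, 1.6, 2.5, 3.0]
toy_labels = [0, 0, 1, 0, 1, 1, 1]

cutoff, sens, spec, j = youden_cutoff(toy_scores, toy_labels)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}, J={j:.2f}")
```

On real data the same search is usually run over the thresholds returned by an ROC routine; the brute-force loop here only keeps the example dependency-free.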

RESULTS

Cutoff scores of 1.54 for the AVQI-3 (55% sensitivity, 80% specificity) and 14.35 dB for CPPs (65% sensitivity, 78% specificity) were identified for detecting voice disorders. Compared with the average ML performance (89% sensitivity, 55% specificity), CPPs offered a better balance between sensitivity and specificity, outperforming AVQI-3 and nearly matching the average ML performance.
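The comparison can be made concrete by computing the Youden Index from the rounded percentages reported here. The sketch below does only that arithmetic. The J values and sensitivity-specificity gaps it prints are derived from the abstract's rounded figures and are not values reported by the authors, and the names `reported`, `youden_j`, and `balance_gap` are illustrative.

```python
# Sensitivity and specificity as reported in the abstract (rounded).
reported = {
    "AVQI-3": (0.55, 0.80),
    "CPPs": (0.65, 0.78),
    "ML average": (0.89, 0.55),
}


def youden_j(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def balance_gap(sensitivity, specificity):
    """Absolute difference between sensitivity and specificity."""
    return abs(sensitivity - specificity)


summary = {
    name: (round(youden_j(se, sp), 2), round(balance_gap(se, sp), 2))
    for name, (se, sp) in reported.items()
}

for name, (j, gap) in summary.items():
    print(f"{name}: J={j:.2f}, |sensitivity - specificity|={gap:.2f}")
```

Under these rounded inputs, CPPs has a J of about 0.43 against about 0.44 for the ML average and 0.35 for AVQI-3, while its sensitivity-specificity gap (0.13) is the smallest of the three. That is consistent with the statement that CPPs is better balanced and close to the average ML performance.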

CONCLUSIONS

ML shows great potential for supporting voice disorder diagnostics, especially as models become more generalizable and easier to interpret. However, current tools like AVQI-3 and CPPs remain more practical and accessible for clinical use in evaluating voice quality than commonly implemented models. CPPs, in particular, offers distinct advantages for identifying voice disorders, making it a recommended and feasible choice for clinics with limited resources.
