Detecting Vocal Fatigue with Neural Embeddings.

Authors

Bayerl Sebastian P, Wagner Dominik, Baumann Ilja, Bocklet Tobias, Riedhammer Korbinian

Affiliations

Technische Hochschule Nürnberg Georg Simon Ohm.

Publication

J Voice. 2023 Feb 9. doi: 10.1016/j.jvoice.2023.01.012.

DOI: 10.1016/j.jvoice.2023.01.012
PMID: 36774263

Abstract

Vocal fatigue refers to the feeling of tiredness and weakness of the voice due to prolonged use. This paper investigates the effectiveness of neural embeddings for the detection of vocal fatigue. We compare x-vectors, ECAPA-TDNN, and wav2vec 2.0 embeddings on a corpus of academic spoken English. Low-dimensional mappings of the data reveal that neural embeddings capture information about the change in vocal characteristics of a speaker during prolonged voice usage. We show that vocal fatigue can be reliably predicted using all three types of neural embeddings after 40 minutes of continuous speaking when temporal smoothing and normalization are applied to the extracted embeddings. We employ support vector machines for classification and achieve accuracy scores of 81% using x-vectors, 85% using ECAPA-TDNN embeddings, and 82% using wav2vec 2.0 embeddings as input features. We obtain an accuracy score of 76% when the trained system is applied to a different speaker and recording environment without any adaptation.
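
The abstract names two preprocessing steps, temporal smoothing and normalization, without giving their details. The following is a minimal sketch of what such steps could look like on a sequence of per-segment embedding vectors. The trailing moving average, the window size, the per-recording z-normalization, and the toy data are all assumptions made for illustration and are not taken from the paper. The support vector machine classification step is not shown.

```python
from statistics import mean, pstdev


def smooth(frames, window=3):
    """Trailing moving average over time (assumed smoothing scheme)."""
    out = []
    for t in range(len(frames)):
        start = max(0, t - window + 1)
        chunk = frames[start:t + 1]
        # Average each embedding dimension across the segments in the window.
        out.append([mean(col) for col in zip(*chunk)])
    return out


def z_normalize(frames):
    """Zero-mean, unit-variance scaling per dimension over one recording
    (assumed normalization scheme)."""
    cols = list(zip(*frames))
    mus = [mean(c) for c in cols]
    # Fall back to 1.0 for constant dimensions to avoid division by zero.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in frames]


# Hypothetical 2-dimensional "embeddings" for five consecutive speech segments.
embeddings = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0], [8.0, 1.0]]
smoothed = smooth(embeddings, window=2)
features = z_normalize(smoothed)
```

In a pipeline of this kind, `features` would then be passed to a classifier together with fatigue labels. Real x-vector, ECAPA-TDNN, or wav2vec 2.0 embeddings have hundreds of dimensions, so an array library would be the more practical choice than plain lists.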
