Multiple Vowels Repair Based on Pitch Extraction and Line Spectrum Pair Feature for Voice Disorder.

Publication information

IEEE J Biomed Health Inform. 2020 Jul;24(7):1940-1951. doi: 10.1109/JBHI.2020.2978103. Epub 2020 Mar 3.

DOI:10.1109/JBHI.2020.2978103
PMID:32149701
Abstract

Voice disorders are increasingly common among voice-related professionals, elderly people, and smokers, which underscores the importance of pathological voice repair. Previous work on pathological voice repair addressed only the sustained vowel /a/; repairing multiple vowels remains challenging because pitch extraction is unstable and formant reconstruction is unsatisfactory. This paper proposes a multiple-vowel repair method for voice disorders based on pitch extraction and the Line Spectrum Pair (LSP) feature. The method broadens the scope of voice repair from the single vowel /a/ to the vowels /a/, /i/, and /u/, and repairs all three successfully. With a deep neural network as the classifier, a voice recognition step separates normal voices from pathological ones. The Wavelet Transform and the Hilbert-Huang Transform are applied for pitch extraction, and the formant is reconstructed from the LSP feature. The final repaired voice is obtained by synthesizing the pitch and the formant. The proposed method is validated on the Saarbrücken Voice Database (SVD). The improvements achieved on three metrics, Segmental Signal-to-Noise Ratio, LSP distance measure, and Mel cepstral distance measure, are 45.87%, 50.37%, and 15.56%, respectively. In addition, an intuitive spectrogram-based analysis shows a prominent repair effect.
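
To make the first of the three reported metrics concrete, here is a minimal Python sketch of a Segmental Signal-to-Noise Ratio computation. It follows the common textbook definition (per-frame SNR in dB, clamped and averaged). The function name `segmental_snr`, the 256-sample frame length, and the clamping range of -10 dB to 35 dB are illustrative assumptions, not details taken from the paper, and the abstract does not say how the percentage improvements were derived from the metric.

```python
import numpy as np


def segmental_snr(reference, processed, frame_len=256, min_db=-10.0, max_db=35.0):
    """Average per-frame SNR in dB between a reference and a processed signal.

    frame_len, min_db and max_db are assumed, commonly used defaults;
    they are not values reported in the paper.
    """
    reference = np.asarray(reference, dtype=float)
    processed = np.asarray(processed, dtype=float)

    # Use only whole frames that exist in both signals.
    n = min(len(reference), len(processed))
    n_frames = n // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")

    ref = reference[: n_frames * frame_len].reshape(n_frames, frame_len)
    proc = processed[: n_frames * frame_len].reshape(n_frames, frame_len)

    eps = 1e-12  # avoids division by zero and log of zero
    signal_energy = np.sum(ref ** 2, axis=1)
    noise_energy = np.sum((ref - proc) ** 2, axis=1)
    snr_db = 10.0 * np.log10((signal_energy + eps) / (noise_energy + eps))

    # Clamp each frame so silent or perfectly matched frames do not dominate.
    return float(np.mean(np.clip(snr_db, min_db, max_db)))
```

In a setup like the one the abstract describes, `reference` would plausibly be a normal-voice vowel and `processed` the pathological or repaired vowel, so that a higher value after repair indicates a closer match; that pairing is an assumption here as well.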

Similar articles

1
Multiple Vowels Repair Based on Pitch Extraction and Line Spectrum Pair Feature for Voice Disorder.
IEEE J Biomed Health Inform. 2020 Jul;24(7):1940-1951. doi: 10.1109/JBHI.2020.2978103. Epub 2020 Mar 3.
2
PVR-AFM: A Pathological Voice Repair System based on Non-linear Structure.
J Voice. 2023 Sep;37(5):648-662. doi: 10.1016/j.jvoice.2021.05.010. Epub 2021 Jul 5.
3
E-DGAN: An Encoder-Decoder Generative Adversarial Network Based Method for Pathological to Normal Voice Conversion.
IEEE J Biomed Health Inform. 2023 May;27(5):2489-2500. doi: 10.1109/JBHI.2023.3239551. Epub 2023 May 4.
4
The Use of Arabic Vowels to Model the Pathological Effect of Influenza Disease by Wavelets.
Comput Math Methods Med. 2019 Dec 4;2019:4198462. doi: 10.1155/2019/4198462. eCollection 2019.
5
Support vector wavelet adaptation for pathological voice assessment.
Comput Biol Med. 2011 Sep;41(9):822-8. doi: 10.1016/j.compbiomed.2011.06.019. Epub 2011 Jul 20.
6
Robustness of auditory Teager Energy Cepstrum Coefficients for classification of pathological and normal voices in noisy environments.
ScientificWorldJournal. 2013 May 28;2013:435729. doi: 10.1155/2013/435729. Print 2013.
7
An Acoustic-Signal-Based Preventive Program for University Lecturers' Vocal Health.
J Voice. 2020 Jan;34(1):88-99. doi: 10.1016/j.jvoice.2018.05.011. Epub 2018 Jul 31.
8
Fuzzy logic based classification and assessment of pathological voice signals.
Annu Int Conf IEEE Eng Med Biol Soc. 2009;2009:328-31. doi: 10.1109/IEMBS.2009.5333867.
9
Using modulation spectra for voice pathology detection and classification.
Annu Int Conf IEEE Eng Med Biol Soc. 2009;2009:2514-7. doi: 10.1109/IEMBS.2009.5334850.
10
Acoustic recognition of voice disorders: a comparative study of running speech versus sustained vowels.
J Acoust Soc Am. 1990 May;87(5):2218-24. doi: 10.1121/1.399189.

Cited by

1
Comparative Analysis of CNN and RNN for Voice Pathology Detection.
Biomed Res Int. 2021 Apr 14;2021:6635964. doi: 10.1155/2021/6635964. eCollection 2021.